middleware:Globus/service/GRAM

WS-GRAM

Configure WS-GRAM

After successful installation of the recommended D-Grid package, Globus is already configured to use PBS (more precisely, TORQUE) as the Local Resource Management System (LRMS). The interface for submitting jobs to the LRMS is provided by a component called the Scheduler Adapter, which is essentially a Perl module, <$GLOBUS_LOCATION>/lib/perl/Globus/GRAM/JobManager/pbs.pm. It should be patched as described below.

Patching the Scheduler Adapter

Lines 387-388:

    elsif($description->jobtype() eq 'mpi' ||
          $description->jobtype() eq 'multiple')

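should be replaced by the following, so that jobs of type 'multiple' take this branch only if more than one host or more than one process is requested: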

    elsif(
          $description->jobtype() eq 'mpi' ||
           ($description->jobtype() eq 'multiple' and
            ($description->host_count() > 1 or $description->count() > 1) 
           )
         )

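After line 408:

            print CMD "#!/bin/sh\n";

a line that sources /etc/profile should be added, so that the generated job script picks up the system-wide environment:

            print CMD "#!/bin/sh\n";
            print CMD ". /etc/profile\n";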
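To confirm that both changes are present after editing, a quick check along the following lines may help. This is only a sketch: the function name pbs_pm_patch_status is hypothetical, and it assumes the patched lines are written exactly as shown in this section.

```shell
#!/bin/sh
# Hypothetical helper (not part of Globus): report whether a pbs.pm file
# contains both patches described above. Assumes the patched lines are
# spelled exactly as in this section.
pbs_pm_patch_status() {
    file=$1
    # Patch 1: extra host_count()/count() condition for 'multiple' jobs.
    # Patch 2: the generated job script sources /etc/profile.
    if grep -F -q 'host_count() > 1 or $description->count() > 1' "$file" &&
       grep -F -q 'print CMD ". /etc/profile' "$file"; then
        echo "patched"
    else
        echo "unpatched"
        return 1
    fi
}

# Example call (path taken from this section):
#   pbs_pm_patch_status "$GLOBUS_LOCATION/lib/perl/Globus/GRAM/JobManager/pbs.pm"
```

It is advisable to keep a backup copy of the original pbs.pm before editing, so that the change can be reverted.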

A more rigorous patch is possible if the Mpiexec implementation by Pete Wyckoff is installed on the cluster. Note that this version of Mpiexec is not fully compatible with those provided with MPICH and other MPI implementations. Most notably, the '-machinefile' parameter does not exist.

Configuring Sudo

To submit jobs on behalf of a user, Globus must be authorized to invoke specific commands via sudo (Super User Do). To this end, add the following lines to /etc/sudoers:

#
# Disable "ssh hostname sudo <cmd>", because it will show the password in clear.
#         You have to run "ssh -t hostname sudo <cmd>".
#
# Defaults    requiretty
 
# Globus GRAM entries
 
globus  ALL=(ALL) NOPASSWD: \
    /usr/local/globus/libexec/globus-gridmap-and-execute \
      -g /etc/grid-security/grid-mapfile \
      /usr/local/globus/libexec/globus-job-manager-script.pl *
 
globus  ALL=(ALL) NOPASSWD: \
    /usr/local/globus/libexec/globus-gridmap-and-execute \
      -g /etc/grid-security/grid-mapfile \
      /usr/local/globus/libexec/globus-gram-local-proxy-tool *
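The sudoers entries above only take effect if the programs they name exist at exactly those paths. The following sketch checks this. The function name check_gram_sudo_targets is hypothetical, and the default prefix /usr/local/globus is taken from the entries above; adjust it if Globus is installed elsewhere.

```shell
#!/bin/sh
# Hypothetical helper (not part of Globus): check that the programs named
# in the sudoers entries exist and are executable under a Globus prefix.
check_gram_sudo_targets() {
    prefix=${1:-/usr/local/globus}
    status=0
    for prog in globus-gridmap-and-execute \
                globus-job-manager-script.pl \
                globus-gram-local-proxy-tool; do
        if [ -x "$prefix/libexec/$prog" ]; then
            echo "ok: $prefix/libexec/$prog"
        else
            echo "missing: $prefix/libexec/$prog"
            status=1
        fi
    done
    return $status
}

# Example call:
#   check_gram_sudo_targets /usr/local/globus
```

The syntax of the sudoers file itself can be checked as root with "visudo -c", and "sudo -l -U globus" lists the commands the globus account is allowed to run.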


Pre-WS GRAM configuration
  • As root user, create the file /etc/xinetd.d/gsigatekeeper with the following content:
service gsigatekeeper
{
    socket_type = stream
    protocol    = tcp
    wait        = no
    user        = root
    env        += LD_LIBRARY_PATH=<$GLOBUS_LOCATION>/lib
    env        += GLOBUS_TCP_PORT_RANGE=20000,25000
    server      = <$GLOBUS_LOCATION>/sbin/globus-gatekeeper
    server_args = -conf <$GLOBUS_LOCATION>/etc/globus-gatekeeper.conf
    disable     = no
}
  • As root user, restart the xinetd daemon:
$ /etc/init.d/xinetd restart
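
The <$GLOBUS_LOCATION> placeholder in the xinetd entry has to be replaced by the actual installation path. One way to avoid typing errors is to generate the file, as in the following sketch. The function name render_gsigatekeeper is hypothetical; the attribute values are those shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Globus): print the gsigatekeeper xinetd
# entry with <$GLOBUS_LOCATION> replaced by a concrete installation path.
render_gsigatekeeper() {
    gl=$1
    cat <<EOF
service gsigatekeeper
{
    socket_type = stream
    protocol    = tcp
    wait        = no
    user        = root
    env        += LD_LIBRARY_PATH=$gl/lib
    env        += GLOBUS_TCP_PORT_RANGE=20000,25000
    server      = $gl/sbin/globus-gatekeeper
    server_args = -conf $gl/etc/globus-gatekeeper.conf
    disable     = no
}
EOF
}

# Example call, as root:
#   render_gsigatekeeper "$GLOBUS_LOCATION" > /etc/xinetd.d/gsigatekeeper
```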

GRAM tests

To verify that GRAM accepts jobs, run the following commands as a grid user:

For WS-GRAM
> globusrun-ws -submit -F <FQDN of the Globus Frontend> -s -c /bin/hostname
 Delegating user credentials...Done.
 Submitting job...Done.
 Job ID: uuid:66720d6a-6aac-11dd-82c4-af7ae8031d29
 Termination time: 08/16/2008 09:27 GMT
 Current job state: Pending
 Current job state: Active
 Current job state: CleanUp-Hold
 dgiref-globus.fzk.de
 Current job state: CleanUp
 Current job state: Done
 Destroying job...Done.
 Cleaning up any delegated credentials...Done.
For Pre-WS-GRAM
> globus-job-run localhost:2119/jobmanager-fork /bin/date
Fri Dec 21 10:59:52 CEST 2007
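
For repeated or scripted checks, the WS-GRAM test can be evaluated automatically by looking for the final "Done" state in the output, as in the transcript above. The following is a minimal sketch. The function name ws_job_done is hypothetical, and redirecting stderr with "2>&1" in the example call is an assumption about where globusrun-ws writes its status messages.

```shell
#!/bin/sh
# Hypothetical helper (not part of Globus): read globusrun-ws output on
# stdin and report whether the job reached the "Done" state.
ws_job_done() {
    if grep -q 'Current job state: Done'; then
        echo "done"
    else
        echo "not done"
        return 1
    fi
}

# Example call (the FQDN placeholder is the one used above):
#   globusrun-ws -submit -F <FQDN of the Globus Frontend> -s -c /bin/hostname 2>&1 | ws_job_done
```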

Turn off fork scheduler

To turn off the fork scheduler, rename the following configuration files, and restart the container as root user:

$ cd $GLOBUS_LOCATION/etc/gram-service-Fork 
$ mv  jndi-config.xml jndi-config.xml_save
$ cd $GLOBUS_LOCATION/etc/grid-services
$ mv jobmanager-fork jobmanager-fork.save
$ /etc/init.d/globus-container restart

A grid user can verify that the fork scheduler is disabled as follows:

For WS-GRAM
> globusrun-ws -submit  -c /bin/hostname
Submitting job...Failed.
globusrun-ws: Error submitting job
globus_soap_message_module: SOAP Fault
Fault code: soapenv:Server.userException
Fault string: java.rmi.RemoteException: Job creation failed.; nested exception is:
java.rmi.RemoteException: The Managed Job Factory Service at 
https://10.156.10.69:8443/wsrf/services/ManagedJobFactoryService does not have a resource with key "Fork".
 
> globusrun-ws -submit -Ft Fork -c /bin/hostname
Submitting job...Failed.
globusrun-ws: Error submitting job
globus_soap_message_module: SOAP Fault
Fault code: soapenv:Server.userException
Fault string: java.rmi.RemoteException: Job creation failed.; nested exception is:
java.rmi.RemoteException: The Managed Job Factory Service at 
https://10.156.10.69:8443/wsrf/services/ManagedJobFactoryService does not have a resource with key "Fork".
For Pre-WS-GRAM
> globus-job-run localhost:2119/jobmanager-fork /bin/date
GRAM job submission failed because the gatekeeper failed to find the requested service (error code 93)
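
The error messages above can also be recognized by a script. The sketch below classifies the output of a submission attempt based on the two messages shown in this section. The function name fork_scheduler_state is hypothetical, and any other output is reported as "unknown" rather than interpreted.

```shell
#!/bin/sh
# Hypothetical helper (not part of Globus): read the output of a fork
# submission attempt on stdin and classify it using the error messages
# shown in this section.
fork_scheduler_state() {
    output=$(cat)
    case $output in
        *'does not have a resource with key "Fork"'*)
            echo "disabled (WS-GRAM)" ;;
        *'error code 93'*)
            echo "disabled (Pre-WS GRAM)" ;;
        *)
            echo "unknown" ;;
    esac
}

# Example call:
#   globus-job-run localhost:2119/jobmanager-fork /bin/date 2>&1 | fork_scheduler_state
```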