middleware:Globus/service

Introduction

The Globus Toolkit service components can be separated into the following groups:

  1. Web-Service based components (WS-GRAM, RFT, MDS4)
  2. Non-Web-Service based components (GridFTP, GSISSH)

Web-Service components

WS-GRAM

Configure WS-GRAM

After successful installation of the recommended D-Grid package, Globus is already configured to use PBS (more precisely, TORQUE) as the Local Resource Management System (LRMS). The interface for submitting jobs to the LRMS is provided by a component called the Scheduler Adapter, which is essentially a Perl module: <$GLOBUS_LOCATION>/lib/perl/Globus/GRAM/JobManager/pbs.pm. It should be patched as described below.

Patching the Scheduler Adapter

Lines 387-388:

    elsif($description->jobtype() eq 'mpi' ||
          $description->jobtype() eq 'multiple')

should be replaced by:

    elsif(
          $description->jobtype() eq 'mpi' ||
           ($description->jobtype() eq 'multiple' and
            ($description->host_count() > 1 or $description->count() > 1) 
           )
         )
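
The effect of the replacement condition can be checked without touching pbs.pm. The following is a minimal shell sketch of the same decision and is not part of Globus: the function name takes_parallel_branch is hypothetical, and its three arguments stand in for the values returned by $description->jobtype(), host_count() and count().

```sh
# Hypothetical helper mirroring the patched elsif condition in pbs.pm.
# Usage: takes_parallel_branch JOBTYPE HOST_COUNT COUNT
# Returns 0 (true) if the patched condition would hold, 1 otherwise.
takes_parallel_branch() {
    jobtype=$1
    host_count=$2
    count=$3

    # 'mpi' jobs always satisfy the condition.
    if [ "$jobtype" = "mpi" ]; then
        return 0
    fi

    # 'multiple' jobs satisfy it only if more than one host
    # or more than one process is requested.
    if [ "$jobtype" = "multiple" ] &&
       { [ "$host_count" -gt 1 ] || [ "$count" -gt 1 ]; }; then
        return 0
    fi

    return 1
}
```

With the patch, a job of type 'multiple' that requests a single host and a single process no longer enters this branch, whereas the unpatched condition accepted every 'multiple' job.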

After line 408:

            print CMD "#!/bin/sh\n";

add a line that sources /etc/profile, so that the block reads:

            print CMD "#!/bin/sh\n";
            print CMD ". /etc/profile\n";

A more rigorous patch is possible if the Mpiexec implementation by Pete Wyckoff is installed on the cluster. Note that this version of Mpiexec is not fully compatible with those provided with MPICH and other MPI implementations. Most notably, the parameter '-machinefile' does not exist.

Configuring Sudo

In order to submit jobs on behalf of a user, Globus must be authorized to invoke specific commands via sudo (Super User Do). To this end, add the following lines to /etc/sudoers:

#
# Disable "ssh hostname sudo <cmd>", because it will show the password in clear.
#         You have to run "ssh -t hostname sudo <cmd>".
#
# Defaults    requiretty
 
# Globus GRAM entries
 
globus  ALL=(ALL) NOPASSWD: \
    /usr/local/globus/libexec/globus-gridmap-and-execute \
      -g /etc/grid-security/grid-mapfile \
      /usr/local/globus/libexec/globus-job-manager-script.pl *
 
globus  ALL=(ALL) NOPASSWD: \
    /usr/local/globus/libexec/globus-gridmap-and-execute \
      -g /etc/grid-security/grid-mapfile \
      /usr/local/globus/libexec/globus-gram-local-proxy-tool *
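
The entries above hard-code /usr/local/globus. If Globus is installed under a different prefix, the paths presumably have to be adjusted in both entries. As a sketch under that assumption, the hypothetical helper below (not a Globus tool) prints the two entries for a given prefix so that they can be reviewed before being added to /etc/sudoers.

```sh
# Hypothetical helper: print the two GRAM sudoers entries for a prefix.
# Usage: gram_sudoers_entries [GLOBUS_PREFIX]
# The default prefix is the one used in the entries shown above.
gram_sudoers_entries() {
    prefix=${1:-/usr/local/globus}

    for tool in globus-job-manager-script.pl globus-gram-local-proxy-tool; do
        # '\\' in the printf format emits the sudoers line continuation.
        printf 'globus  ALL=(ALL) NOPASSWD: \\\n'
        printf '    %s/libexec/globus-gridmap-and-execute \\\n' "$prefix"
        printf '      -g /etc/grid-security/grid-mapfile \\\n'
        printf '      %s/libexec/%s *\n\n' "$prefix" "$tool"
    done
}

# Example: gram_sudoers_entries "$GLOBUS_LOCATION"
```

The function only writes to standard output; it does not modify /etc/sudoers.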


Pre-WS GRAM configuration
  • As root, create the file /etc/xinetd.d/gsigatekeeper with the following content:
service gsigatekeeper 
{ 
socket_type = stream 
protocol  = tcp 
wait  = no 
user = root 
env  += LD_LIBRARY_PATH=<$GLOBUS_LOCATION>/lib 
env += GLOBUS_TCP_PORT_RANGE=20000,25000 
server = <$GLOBUS_LOCATION>/sbin/globus-gatekeeper 
server_args = -conf <$GLOBUS_LOCATION>/etc/globus-gatekeeper.conf 
disable = no 
}
  • As root, restart the xinetd daemon:
$ /etc/init.d/xinetd restart

GRAM tests

To verify that GRAM accepts jobs, run the following as a grid user:

For WS-GRAM
> globusrun-ws -submit -F <FQDN of the Globus Frontend> -s -c /bin/hostname
 Delegating user credentials...Done.
 Submitting job...Done.
 Job ID: uuid:66720d6a-6aac-11dd-82c4-af7ae8031d29
 Termination time: 08/16/2008 09:27 GMT
 Current job state: Pending
 Current job state: Active
 Current job state: CleanUp-Hold
 dgiref-globus.fzk.de
 Current job state: CleanUp
 Current job state: Done
 Destroying job...Done.
 Cleaning up any delegated credentials...Done.
For Pre-WS-GRAM
> globus-job-run localhost:2119/jobmanager-fork /bin/date
Fri Dec 21 10:59:52 CEST 2007
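
For scripted checks, the final state reported by globusrun-ws can be extracted from its output. The sketch below assumes the output format shown in the WS-GRAM example above and that the state lines arrive on standard input; last_job_state is a hypothetical helper, not a Globus command.

```sh
# Hypothetical helper: print the last "Current job state" value
# found in globusrun-ws output read from standard input.
last_job_state() {
    sed -n 's/^[[:space:]]*Current job state: //p' | tail -n 1
}

# Example (assumes the state lines are written to standard output):
#   globusrun-ws -submit -F <FQDN> -s -c /bin/hostname | last_job_state
```

A successful run like the one shown above ends in the state Done.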
Turn off fork scheduler

To turn off the fork scheduler, rename the following configuration files, and restart the container as root user:

$ cd $GLOBUS_LOCATION/etc/gram-service-Fork 
$ mv  jndi-config.xml jndi-config.xml_save
$ cd $GLOBUS_LOCATION/etc/grid-services
$ mv jobmanager-fork jobmanager-fork.save
$ /etc/init.d/globus-container restart

That the fork scheduler is disabled can be verified by a grid user as follows; each submission should fail with the error shown:

For WS-GRAM
> globusrun-ws -submit  -c /bin/hostname
Submitting job...Failed.
globusrun-ws: Error submitting job
globus_soap_message_module: SOAP Fault
Fault code: soapenv:Server.userException
Fault string: java.rmi.RemoteException: Job creation failed.; nested exception is:
java.rmi.RemoteException: The Managed Job Factory Service at 
https://10.156.10.69:8443/wsrf/services/ManagedJobFactoryService does not have a resource with key "Fork".
 
> globusrun-ws -submit -Ft Fork -c /bin/hostname
Submitting job...Failed.
globusrun-ws: Error submitting job
globus_soap_message_module: SOAP Fault
Fault code: soapenv:Server.userException
Fault string: java.rmi.RemoteException: Job creation failed.; nested exception is:
java.rmi.RemoteException: The Managed Job Factory Service at 
https://10.156.10.69:8443/wsrf/services/ManagedJobFactoryService does not have a resource with key "Fork".
For Pre-WS-GRAM
> globus-job-run localhost:2119/jobmanager-fork /bin/date
GRAM job submission failed because the gatekeeper failed to find the requested service (error code 93)

RFT

RFT configuration

Configure the PostgreSQL database:

As root, edit /etc/sysconfig/postgresql to ensure that TCP/IP connections (option -i) are allowed:

$ vi /etc/sysconfig/postgresql
# Add: 
POSTGRES_OPTIONS="-i"

As the postgres user initialize the new database:

> initdb -D /var/lib/pgsql/data

As the postgres user, grant the globus user access by editing pg_hba.conf:

> vi /var/lib/pgsql/data/pg_hba.conf

Add to the end of the file:

host[TAB]rftDatabase[TAB]globus[TAB]<IP-addresses of GT4 frontends>[TAB]255.255.255.255[TAB][md5/trust]

Note: [TAB] stands for a tab character. For security reasons, password protection is recommended: use md5 if password protection is desired and trust otherwise. If password protection is enabled, every database user must be given a password.
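
To avoid typing the tab characters by hand, the line can be generated. The following is a minimal sketch that follows the field layout shown above; rft_hba_line is a hypothetical helper name, and the IP address in the example is a documentation placeholder.

```sh
# Hypothetical helper: print a tab-separated pg_hba.conf entry
# for the RFT database.
# Usage: rft_hba_line FRONTEND_IP [md5|trust]   (default: md5)
rft_hba_line() {
    ip=$1
    method=${2:-md5}

    # Only the two methods named in the text are accepted.
    case $method in
        md5|trust) ;;
        *) echo "method must be md5 or trust" >&2; return 1 ;;
    esac

    printf 'host\trftDatabase\tglobus\t%s\t255.255.255.255\t%s\n' "$ip" "$method"
}

# Example: rft_hba_line 192.0.2.10 md5 >> /var/lib/pgsql/data/pg_hba.conf
```

The default of md5 follows the recommendation to use password protection.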

As root, restart the PostgreSQL database server:

$ /etc/init.d/postgresql restart

As the postgres user, create a database ID for the user globus:

> createuser globus

Answer the next question with "y". You will then be asked for the password of the database ID globus.

As the postgres user, source the shell script /usr/local/bin/globus-env-setup.sh:

> . /usr/local/bin/globus-env-setup.sh

Now create the RFT database as postgres user:

> createdb rftDatabase 
> psql -d rftDatabase -f  $GLOBUS_LOCATION/share/globus_wsrf_rft/rft_schema.sql

As the globus user, configure the use of the RFT database:

% vi $GLOBUS_LOCATION/etc/globus_wsrf_rft/jndi-config.xml
# If password protection is active, set the chosen password (e.g. 'foo') as the value of the password parameter. Otherwise leave it empty.
<resource name="dbConfiguration"
            type="org.globus.transfer.reliable.service.database.RFTDatabaseOptions">
            <resourceParams>
            <parameter>
                <name>
                driverName
                </name>
                <value>
                org.postgresql.Driver
                </value>
            </parameter>
            <parameter>
                <name>
                connectionString
                </name>
                <value>
                jdbc:postgresql://dgiref-globus.fzk.de/rftDatabase
                </value>
            </parameter>
            <parameter>
                <name>
                userName
                </name>
                <value>
                globus
                </value>
            </parameter>
            <parameter>
                <name>
                password
                </name>
                <value>
                </value>
            </parameter>
            </resourceParams>
        </resource>

As root, restart the Globus container:

$ /etc/init.d/globus-container restart

RFT test

If the RFT service is properly configured, it should be possible to copy a test file as follows:

As the globus user, copy the file transfer.xfr to a temporary directory such as /tmp and create an empty test file. Afterwards, replace the entry localhost in /tmp/transfer.xfr with the FQDN of the Globus frontend:

% cp $GLOBUS_LOCATION/share/globus_wsrf_rft_test/transfer.xfr /tmp
% touch /tmp/rftTest.tmp
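
The replacement of localhost is not covered by the commands above. One way to do it is with sed. The sketch below assumes a sed that accepts -i with a backup suffix (as GNU and BSD sed do); set_transfer_host is a hypothetical helper name.

```sh
# Hypothetical helper: replace every occurrence of "localhost"
# in a transfer file with the given FQDN.
# A backup of the original is kept as FILE.bak.
# Usage: set_transfer_host FILE FQDN
set_transfer_host() {
    file=$1
    fqdn=$2
    sed -i.bak "s/localhost/${fqdn}/g" "$file"
}

# Example: set_transfer_host /tmp/transfer.xfr <FQDN>
```

Comparing the file with its .bak copy afterwards shows exactly which entries were changed.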

Start the test as grid user:

> rft -h <FQDN> -f /tmp/transfer.xfr

MDS

MDS4 configuration

To show the FQDN instead of the IP address of the Globus frontend in WebMDS, add, as the globus user, the following parameters to the <globalConfiguration> element in $GLOBUS_LOCATION/etc/globus_wsrf_core/server-config.wsdd:

<globalConfiguration> 
 <parameter name="logicalHost" value="<FQDN>"/> 
 <parameter name="publishHostName" value="true"/> 
 ... 
</globalConfiguration>

To register your site with the D-Grid-wide MDS Index located at the LRZ in Munich and the redundant MDS Index located at KIT in Karlsruhe, add, as the globus user, the appropriate index URLs as <upstream> elements in $GLOBUS_LOCATION/etc/globus_wsrf_mds_index/hierarchy.xml. Use either two entries:

    <config>
           <upstream>$URL1</upstream>
           <upstream>$URL2</upstream>
    </config>

or a single entry:

    <config>
           <upstream>$URL</upstream>
    </config>

Depending on the site hierarchy and on the VO the site belongs to, the values for URL, URL1 and URL2 must be as follows:

For Kerndgrid Sites
NOT new Globus installation in the Site
If an MDS Site-Index named "site-index.mysite.de" and listening on port 8443 is already available (e.g. because there is more than one Globus installation at the site):
URL = https://site-index.mysite.de:8443/wsrf/services/DefaultIndexService
NEW Globus installation in the Site
If this is the first Globus installation at the site, the Site Index itself should be registered with the Kerndgrid MDS index hosted at the LRZ and KIT:
URL1 = https://mds-dgi.lrz.de:8445/wsrf/services/DefaultIndexService
URL2 = https://dgrid-mds.scc.kit.edu:8443/wsrf/services/DefaultIndexService
For Globus 4.2 MDS, please use:
URL = https://mds2-dgi.lrz.de:8445/wsrf/services/DefaultIndexService

In this case, the geomaint sensor must also be installed and configured. In the configuration of the sensor, please specify the name of your site as registered in the D-Grid Resources Registration Service (GRRS)!

For Community Sites
NOT new Globus installation in the Site
If a Site-Index named "site-index.mysite.de" and listening on port 8443 is already available (e.g. because there are several Globus installations at the site):
URL = https://site-index.mysite.de:8443/wsrf/services/DefaultIndexService
NEW Globus installation in the Site
If this is the first Globus installation at the site, the Site Index itself should be registered with the Community-Index:
URL = https://index.mycommunity.de:8443/wsrf/services/DefaultIndexService
Please ask the community leader for the corresponding Community-Index address.
NEW Community-Index installation in the community
If this is the first Globus installation in the whole community, an additional Community-Index should be set up (on this or another computer). This Community-Index registers itself directly with the central D-Grid MDS4 indexes at the LRZ and KIT:
URL1 = https://mds-dgrid.lrz.de:8443/wsrf/services/DefaultIndexService
URL2 = https://dgrid-mds.scc.kit.edu:8443/wsrf/services/TopIndexService
For Globus 4.2 MDS, please use:
URL = https://mds2-dgrid.lrz.de:8443/wsrf/services/DefaultIndexService

In this case, the geomaint sensor must also be installed and configured. In the configuration of the sensor, please specify the name of your site as registered in the D-Grid Resources Registration Service (GRRS)!

Please send your index address to mab<nospam>d-grid.de so that your site index becomes visible on the central MDS sites, the LRZ WebMDS and the redundant KIT WebMDS.

Additional information e.g. the list of available Community-addresses can be found here.
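
Once the upstream URLs for a site are known, the <config> fragment for hierarchy.xml can be generated rather than typed. The sketch below mirrors only the elements shown on this page; an existing hierarchy.xml may contain further attributes or elements that should be preserved, and hierarchy_config is a hypothetical helper name.

```sh
# Hypothetical helper: print a <config> fragment with one
# <upstream> element per URL given on the command line.
# Usage: hierarchy_config URL [URL...]
hierarchy_config() {
    printf '<config>\n'
    for url in "$@"; do
        printf '    <upstream>%s</upstream>\n' "$url"
    done
    printf '</config>\n'
}

# Example for a new Kerndgrid site (URLs as listed above):
#   hierarchy_config \
#     https://mds-dgi.lrz.de:8445/wsrf/services/DefaultIndexService \
#     https://dgrid-mds.scc.kit.edu:8443/wsrf/services/DefaultIndexService
```

Passing one URL yields the single-entry variant, passing two yields the redundant variant.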

MDS4 test

After the previous configuration steps, restart the container as root:

$ /etc/init.d/globus-container restart

After around 10 minutes, the information about your installation will appear in the LRZ WebMDS and the redundant KIT WebMDS. For Globus 4.2, the information will appear in the LRZ WebMDS 4.2.

To see the provided monitoring data from a grid client, run the following as a grid user:

> wsrf-query -s https://<FQDN>:8443/wsrf/services/DefaultIndexService

Geomaint sensor

The GeoMaint sensor, part of the Globus MonMan incubator project, must be installed on every Globus site index server to forward monitoring data to the central MDS Index Service. The site-specific monitoring data gathered by the sensor contains the geographical coordinates as well as information about maintenance and contact persons. This information is used by the LRZ WebMDS, the LRZ WebMDS 4.2 and the redundant KIT WebMDS to show the current site status in the topology map.

Geomaint can be downloaded from MonMan repository. A guide to the installation can be found here or in the Readme File.

Note: The current version of the Geomaint sensor, 1.2.3, is compatible with Globus Toolkit 4.0.x and Globus Toolkit 4.2.x.
vi $GLOBUS_LOCATION/libexec/infoprovider/conf/site.conf

# Geolocation
site.location=Garching, Deutschland
site.latitude=48.26166   # at least 5 decimal places
site.longitude=11.66638  # at least 5 decimal places
site.web=http://mabtest.lrz-muenchen.de
site.sponsor=BMBF

...

# Configuration of ongoing or planned maintenance work:
# 0 represents no current or planned maintenance (running)
# 1 stands for scheduled maintenance work
# 2 stands for maintenance in progress
#site.maintenance=3unconfigured site sensor
#site.maintenance=2cluster is down for maintenance until 7pm
#site.maintenance=1maintenance today from 5pm to 7pm
#site.maintenance=0Running
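
The comments in site.conf ask for at least 5 decimal places in the coordinates. A quick way to check a value before entering it is sketched below; has_five_decimals is a hypothetical helper and checks only the number format, not whether the coordinate is geographically valid.

```sh
# Hypothetical helper: succeed if the argument is a decimal number
# with at least 5 digits after the decimal point.
# Usage: has_five_decimals VALUE
has_five_decimals() {
    printf '%s\n' "$1" | grep -Eq '^-?[0-9]+\.[0-9]{5,}$'
}

# Example:
#   has_five_decimals 48.26166 && echo "latitude format ok"
```
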
Ganglia: resource monitoring

Ganglia cluster monitoring is used to monitor the individual compute nodes. It provides parameters such as operating system, RAM size or processor architecture. It is recommended to install Ganglia on the compute nodes to provide cluster information in MDS4. This enables grid users and resource brokers to match their requirements against the available computing resources.

The information about installing Ganglia and the software packages can be found here. A guide to link the cluster Ganglia Monitoring toolkits to the Globus Toolkit can be found here.

To enable the Ganglia information provider for MDS4 execute the following command:

$ mds-gluerp-configure pbs ganglia $GLOBUS_LOCATION/etc/gram-service-PBS/gluerp-config.xml

If no Ganglia monitoring daemon is installed on the Globus frontend, you must enter the hostname and port where Ganglia is running by editing the file:

$GLOBUS_LOCATION/etc/gram-service-PBS/gluerp-config.xml

After that you have to restart the Globus Container.

$ /etc/init.d/globus-container restart

Non-Web-Service components

GridFTP

GridFTP configuration

As root, insert the following into the file /etc/xinetd.d/gsiftp:

service gsiftp 
{ 
instances  = 100 
socket_type  = stream 
wait  = no 
user  = root
env  += LD_LIBRARY_PATH=<$GLOBUS_LOCATION>/lib
env  += GLOBUS_TCP_PORT_RANGE=20000,25000
server  = <$GLOBUS_LOCATION>/sbin/globus-gridftp-server
server_args  = -i
nice = 10
disable = no
}

Restart the xinetd daemon as the root user:

$ /etc/init.d/xinetd restart

GridFTP test

To verify that GridFTP can transfer data successfully, run the following as a grid user:

> grid-proxy-init
> globus-url-copy gsiftp://localhost/etc/hosts file:///tmp/hosts_copy 
> ls /tmp/hosts_copy

GSISSH

Note: see Firewall configuration.

GSISSH configuration

To configure the 'gsissh' start script do the following as 'root':

su root
cp $GLOBUS_LOCATION/sbin/SXXsshd /etc/init.d/gsisshd
chkconfig --add gsisshd

As the 'globus' user, change the port number from 22 to 2222 in the following files:

su globus
vi $GLOBUS_LOCATION/etc/ssh/ssh_config
vi $GLOBUS_LOCATION/etc/ssh/sshd_config
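
The port change can also be made non-interactively. The sketch below assumes that the files contain a line of the form "Port 22", possibly commented out with "#", as is common for OpenSSH-style configuration files, and that sed accepts -E and -i with a backup suffix; set_gsissh_port is a hypothetical helper name.

```sh
# Hypothetical helper: set the Port directive in an ssh_config or
# sshd_config file. Matches "Port 22" with optional leading "#"
# and whitespace. A backup is kept as FILE.bak.
# Usage: set_gsissh_port FILE [PORT]   (default port: 2222)
set_gsissh_port() {
    file=$1
    port=${2:-2222}
    sed -i.bak -E \
        "s/^[[:space:]]*#?[[:space:]]*Port[[:space:]]+22[[:space:]]*\$/Port ${port}/" \
        "$file"
}

# Example:
#   set_gsissh_port $GLOBUS_LOCATION/etc/ssh/ssh_config
#   set_gsissh_port $GLOBUS_LOCATION/etc/ssh/sshd_config
```

If a file contains no matching line, it is left unchanged and the Port directive has to be added by hand.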

As 'root' start the 'gsissh' daemon with the following command:

su root
/etc/init.d/gsisshd start

Setting up as a service
  • To work with the GSI-SSH service, the following line must be inserted as 'root' into the file /etc/services:
gsissh             2222/tcp
  • In /etc/hosts the IP address of the server must be mapped to the FQDN as follows:
<IP ADDRESS>    <FQDN>    <hostname>
  • Additionally, as 'root', add the following entry to the file /etc/hosts.allow:
echo "gsisshd:ALL:ALLOW" >> /etc/hosts.allow
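
Appending with >> adds a duplicate line each time the step is repeated. A small guard avoids that; the sketch below is one possible approach, and append_once is a hypothetical helper name.

```sh
# Hypothetical helper: append LINE to FILE only if FILE does not
# already contain exactly that line.
# Usage: append_once LINE FILE
append_once() {
    line=$1
    file=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (as root):
#   append_once "gsissh             2222/tcp" /etc/services
#   append_once "gsisshd:ALL:ALLOW" /etc/hosts.allow
```

The comparison is an exact whole-line match, so an existing entry with different spacing is not detected.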
Starting and stopping the service
service gsisshd [start|stop]