middleware:Glite/31

Introduction

gLite is a lightweight, service-oriented Grid middleware that provides services for managing distributed computing and storage resources, together with the required security, auditing and information services. A grid running the gLite middleware consists of several sites providing computing and storage resources, interconnected by a common information system and some shared services.

Figure: Simplified gLite 3.1 architecture

The gLite middleware itself is a complex system of interconnected parts that interact over the network. This includes both the middleware for storing data (the dCache Storage Element (SE)) and the cluster resources (Worker Nodes, Local Resource Management System, NFS server).

Every gLite instance has a Computing Element (CE) as the frontend for job submission. All connections must pass through a generic interface to the cluster (the Grid Gate).

The Information Service (IS), or "site BDII", provides information about the Grid resources and their status, which can be used for monitoring and accounting.

The current D-Grid gLite implementation uses the Globus Monitoring & Discovery Service (MDS) for resource discovery and to publish the resource status.


Package:    glite CREAM CE
 os:             Scientific Linux version 5.6 64 bit
 server:        dgiref-glite.fzk.de
 manuals:   glite CREAM
 monitoring: monitoring page




Note from ticket 614:

WARNING:

User files in the home directories of user accounts may be deleted by a gLite cleanup cronjob if a site uses gLite together with another middleware (Globus, Unicore or both). The problematic cronjob is:
/etc/cron.d/cleanup-grid-accounts
which is intended to keep the gLite pool accounts empty and reusable by other Grid users.

D-Grid uses shared accounts with fixed user mappings instead of pool accounts, hence the problematic cronjob should be disabled. This cronjob writes its logfiles to:

/var/log/cleanup-grid-accounts.log*
and its behaviour can be controlled with the configuration file
/opt/lcg/etc/cleanup-grid-accounts.conf

The recommended solution is to disable that cronjob completely on the gLite CE. Please be aware that:

  • the cronjob entry will be recreated by YAIM if you reconfigure your node (see /opt/glite/yaim/functions/config_users).
  • if you installed the gLite software on the cluster nodes (i.e. the WNs) from RPMs instead of the tarball package, all WNs are affected as well and run the cleanup cronjob.
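One possible way to disable the cronjob is to move its file out of /etc/cron.d, so that cron no longer picks it up while a copy is kept for reference. The following is a minimal sketch, not an official procedure: the function name and the backup directory /root/disabled-cron are assumptions made for this example. Because YAIM recreates the entry, the step would have to be repeated after every reconfiguration, on the CE and on any affected WN.

```shell
#!/bin/sh
# Move the cleanup cronjob out of the cron directory so that cron stops running it.
# Usage: disable_cleanup_cron [CRON_FILE] [BACKUP_DIR]
# BACKUP_DIR defaults to a hypothetical location chosen for this sketch.
disable_cleanup_cron() {
    cron_file="${1:-/etc/cron.d/cleanup-grid-accounts}"
    backup_dir="${2:-/root/disabled-cron}"

    if [ ! -f "$cron_file" ]; then
        echo "nothing to do: $cron_file not found"
        return 0
    fi

    mkdir -p "$backup_dir" || return 1
    mv "$cron_file" "$backup_dir/" || return 1
    echo "moved $cron_file to $backup_dir"
}

# Example (as root):
#   disable_cleanup_cron
```

Moving the file is used here instead of renaming it in place, because whether a renamed file inside /etc/cron.d is still executed depends on the cron implementation.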

Note: The gLite CE requires the user edguser in the MAUI configuration for the Torque server.


Please open an NGI-DE ticket if you experience any installation or configuration problems.

gLite server v.3.1

Prepare

Software
  • Scientific Linux version 4.8 32 bit
  • Java JDK >= 1.6.0
  • perl
  • Torque Client

Optimizing the configuration:


Use a minimal operating system installation without a firewall. To verify that a package is installed, use the command:

  • rpm -qa | grep package_name

Install the following additional packages:

  • yum -y install wget yum rpm make gcc gcc-c++ tar sed zlib openssl

After the installation is complete, turn off any unnecessary services (like gpm, sendmail, cups, haldaemon, messagebus, pcmcia, anacron, atd) with the following command:

  • chkconfig <SERVICE> off
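The chkconfig step can be applied to the whole list of services in one loop. This is a small sketch; the function name and the DRY_RUN switch are assumptions introduced for this example. By default it only prints the commands so that the list can be reviewed first.

```shell
#!/bin/sh
# Turn off the given services with chkconfig.
# With DRY_RUN=1 (the default in this sketch) the commands are only printed.
disable_services() {
    for svc in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "chkconfig $svc off"
        else
            chkconfig "$svc" off
        fi
    done
}

# Example: review the commands for the services named in the text above
#   disable_services gpm sendmail cups haldaemon messagebus pcmcia anacron atd
# Example: actually apply them (as root)
#   DRY_RUN=0; disable_services gpm sendmail cups
```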

Configure the following settings for the server:

The supported installation method for SL4 is yum. You have to configure the yum repositories yourself and install the meta packages in whichever way you prefer.

Note: YAIM does NOT support installation.
  • Download the following repo files into /etc/yum.repos.d:
    • jpackage.repo
    • lcg-CA.repo
    • lcg-CE.repo
    • glite-TORQUE_utils.repo
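The download of the four repo files can be scripted. The sketch below only builds the list of URLs; the base URL is deliberately a parameter, because this page does not state where these particular files are hosted. The function name is an assumption made for this example.

```shell
#!/bin/sh
# Print the download URL of each repo file listed above.
# Usage: repo_urls BASE_URL
# BASE_URL is site-specific and not given in this text; pass the location you use.
repo_urls() {
    base="${1%/}"   # drop one trailing slash, if present
    for repo in jpackage lcg-CA lcg-CE glite-TORQUE_utils; do
        echo "$base/$repo.repo"
    done
}

# Example (BASE_URL is a placeholder):
#   cd /etc/yum.repos.d && repo_urls "$BASE_URL" | xargs -n 1 wget
```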


Firewall configuration

The LCG/gLite frontend runs the LCG CE and site BDII services. To enable communication, make sure the following ports are open (see how to open a port in the firewall):

Service                                    Incoming ports (TCP)   Differs from default configuration
GRAM Gatekeeper + Jobmanager               2119                   No
Globus port range (Jobmanager, GridFTP)    20000-25000            No
BDII                                       2170                   No
GridFTP                                    2811                   No
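If the host firewall is managed with iptables (an assumption; the page does not say which tool is used), the ports from the table translate into rules like the ones printed by this sketch. The function name is hypothetical. It only prints the rules, since their position relative to existing REJECT or DROP rules has to be checked on the actual host.

```shell
#!/bin/sh
# Print one iptables rule per incoming TCP port (or port range) from the table.
# iptables writes a port range as first:last.
print_ce_firewall_rules() {
    for port in 2119 20000:25000 2170 2811; do
        echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
    done
}

# Example: review the rules before applying them by hand
#   print_ce_firewall_rules
```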

Install

The D-Grid reference installation uses the LCG CE variant for the gLite computing resources. Hence the following three main gLite components must be installed on the CE (Computing Element):

  1. Computing Element: lcg-CE package
  2. Information system: glite-BDII package
  3. Batch system components: glite-TORQUE_utils package


Configure

Note: To install the gLite monitoring services (BDII and RGMA), please refer to the gLite services section.

Generally speaking, the gLite configuration is done by the YAIM packages (see the YAIM guide for a description of YAIM). There are three important site-specific configuration files:

A description of the file structure can be found in /opt/glite/yaim/examples/ (for example users.conf.README). The file users.conf must be created or adapted for the users of all VOs. During configuration, YAIM creates these users if they do not exist yet. If the user accounts already exist, YAIM does not change their UIDs/GIDs. The entries are controlled in the directory /etc/grid-security/gridmapdir.
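Because YAIM leaves existing accounts untouched, it can help to see beforehand which logins from users.conf are already present and with which UID. The sketch below assumes the colon-separated layout UID:LOGIN:GID[,GID...]:GROUP[,GROUP...]:VO:FLAG: for users.conf; please verify this against users.conf.README on your system. The function name is hypothetical.

```shell
#!/bin/sh
# For every users.conf entry whose login already exists locally, print the UID
# requested in users.conf next to the UID the account actually has.
# Assumed field layout (check users.conf.README): field 1 = UID, field 2 = login.
existing_users() {
    conf="$1"
    # drop comment lines; awk skips blank lines because they have no fields
    grep -v '^[[:space:]]*#' "$conf" | awk -F: 'NF >= 2 { print $1, $2 }' |
    while read -r uid login; do
        if current=$(id -u "$login" 2>/dev/null); then
            echo "$login wanted=$uid existing=$current"
        fi
    done
}

# Example:
#   existing_users /path/to/users.conf
```

A line where the wanted and existing values differ points to an account that will keep its current UID after the YAIM run.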

Certificates

Certificates can be installed in one of two ways:

  • Use the apt repository at savannah.fzk.de. Examples:
    • Install the fzk-vomscert package from the apt repository:

        rpm savannah.fzk.de repository/fzk security

        cat << EOF > /etc/apt/sources.list.d/fzk.list
        ###
        ### FZK apt repository containing some packages needed for DGrid
        ###   Currently these are the VOMS server certificate, and the GridKa-CA
        ###   configuration rpms. Do not remove this repository.
        ###
        rpm http://savannah.fzk.de repository/fzk security
        EOF

        apt-get update
        apt-get install fzk-vomscert
    • GSI configuration. Install the ca_FZK-local package from the following apt repository:

        rpm savannah.fzk.de repository/fzk security
  • Use the D-Grid download area (see the following script)


Proceed

The gLite instance is started automatically.

Note: To make the stagein/stageout options available for PBS jobs, /etc/ssh/shosts.equiv and /etc/ssh/ssh_known_hosts should be distributed from the gLite server to all worker nodes. The reference installation uses cfengine to implement this procedure.

Initial test

Examine the newly installed system by the following commands:

Update

Updates to gLite 3.1 are released regularly. It is enough to execute yum update to update the instance.

WARNING: Several sites use an automatic update mechanism. Middleware updates sometimes require non-trivial configuration changes or a reconfiguration of the service. This can involve database schema changes, service restarts, new configuration files, etc., which makes it difficult to ensure that automatic updates will not break a service. Therefore, DO NOT USE AN AUTOMATIC UPDATE PROCEDURE OF ANY KIND!

gLite services

Top Level

There are some "top level" services, provided for interactions between provider sites and users.

  1. "Top level BDII" - the Berkeley Database Information Index is a Lightweight Directory Access Protocol (LDAP) server that collects the data from the sites' information services.
  2. Replica Catalog - keeps track of where data are stored.
  3. Workload Management System (WMS) - a set of Grid middleware components responsible for distributing and managing tasks across Grid resources, in such a way that applications are conveniently, efficiently and effectively executed. The core component is the Workload Manager (WM), whose purpose is to accept and satisfy requests for job management coming from its clients.

Site-level

BDII

To install the BDII, use the following procedure:

  • download the glite-BDII.repo
  • install package glite-BDII with yum
  • configure site-info.def
cd /etc/yum.repos.d
 
wget http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/glite-BDII.repo
 
yum -y install glite-BDII
 
## glite site BDII 
# For the configuration of the site BDII, the following variables have to be set in site-info.def:
 
CE_HOST
DCACHE_ADMIN ## when using dCache as SE
SITE_EMAIL
SITE_LAT
SITE_LONG
SITE_NAME 
BDII_HOST
SITE_BDII_HOST
BDII_REGIONS
BDII_CE_URL
BDII_SE_URL ## when a SE is available on the Site
 
# The configuration is done by
 
/opt/glite/yaim/bin/yaim -c -s "site-info.def" -n BDII_site
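Before running yaim, it can be useful to check that site-info.def actually contains an assignment for each of the variables listed above. The following sketch does a purely textual check; the function name is hypothetical, and it does not verify that the values are meaningful or non-empty.

```shell
#!/bin/sh
# Print every variable name that has no NAME=... assignment in the given file.
# Usage: missing_site_vars FILE VAR...
missing_site_vars() {
    file="$1"
    shift
    for var in "$@"; do
        # accept an assignment at the start of a line, optionally after "export"
        grep -Eq "^[[:space:]]*(export[[:space:]]+)?${var}=" "$file" || echo "$var"
    done
}

# Example with the mandatory site BDII variables from the list above:
#   missing_site_vars site-info.def CE_HOST SITE_EMAIL SITE_LAT SITE_LONG \
#       SITE_NAME BDII_HOST SITE_BDII_HOST BDII_REGIONS BDII_CE_URL
```

Empty output means that every listed variable is assigned somewhere in the file; commented-out assignments are not counted.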

MON-RGMA

Note: The current Reference Installation does not use any of the services provided by the MON component; in particular, it does not make use of RGMA at all. If your site needs it, use the following documentation.

To install the MON component, use the following procedure:

  • download the file glite-MON.repo
  • install the package glite-MON with yum
  • configure site-info.def


cd /etc/yum.repos.d
 
wget http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/glite-MON.repo
 
yum -y install glite-MON
 
## glite MonBox 
# For the configuration of the MonBox, the following variables have to be set in site-info.def:
 
APEL_DB_PASSWORD
CE_HOST
GRIDICE_SERVER_HOST
MON_HOST
MYSQL_PASSWORD
SITE_NAME
SITE_BDII_HOST 
 
# The configuration is done by
/opt/glite/yaim/bin/yaim -c -s "site-info.def" -n MON

Attribute-based authorization

Attribute-based authorization is already part of the gLite user administration; it only requires configuring the /opt/glite/yaim/etc/groups.conf file.

release:/glite/yaim/etc/groups.conf

JavaGAT

Regarding security, the gLite adaptor behaves mostly like Globus. The difference between the Globus Toolkit and gLite is that, instead of an entirely self-signed proxy, gLite uses so-called VOMS proxies for authentication and authorization.

  1. Place your personal certificate files userkey.pem and usercert.pem in the directory $HOME/.globus.
  2. Place the host certificates of the Grid hosts you want to access in the directory $HOME/.globus/certificates.
  3. The file $HOME/.globus/cog.properties should exist and look like this:
cat $HOME/.globus/cog.properties
 
#Java CoG Kit Configuration File
#usercert: The path to the file containing your dgrid certificate.
usercert=/home/dgdt0000/.globus/usercert.pem
# userkey: The path to the file containing your Grid key.
userkey=/home/dgdt0000/.globus/userkey.pem
# proxy: The name under which your proxy certificate which you create with grid-proxy-init is stored.
proxy=/tmp/x509up_u1000
#cacert: The path of the directory, which contains the host certificates.
#cacert=/etc/grid-security/certificates
cacert=/home/dgdt0000/.globus/cog-certificates
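A quick consistency check of cog.properties can catch wrong paths before the adaptor is used. The sketch below reads the usercert, userkey and cacert entries and tests whether the referenced files and directory exist. The function names are hypothetical, and the parsing assumes simple key=value lines as in the example above.

```shell
#!/bin/sh
# Print the value of KEY from a properties file (first uncommented match).
# Usage: prop_value FILE KEY
prop_value() {
    sed -n "s/^[[:space:]]*${2}[[:space:]]*=[[:space:]]*//p" "$1" | head -n 1
}

# Check that usercert and userkey point to files and cacert to a directory.
# Returns 0 if everything exists, 1 otherwise.
check_cog_properties() {
    props="$1"
    status=0
    for key in usercert userkey; do
        path=$(prop_value "$props" "$key")
        if [ -z "$path" ] || [ ! -f "$path" ]; then
            echo "$key: missing file '$path'"
            status=1
        fi
    done
    path=$(prop_value "$props" cacert)
    if [ ! -d "$path" ]; then
        echo "cacert: missing directory '$path'"
        status=1
    fi
    return $status
}

# Example:
#   check_cog_properties "$HOME/.globus/cog.properties" && echo "cog.properties looks consistent"
```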

To be able to make the VOMS-proxy request on behalf of the user, the gLite adaptor needs to know a few additional pieces of data:

  1. The name of the VO for which the user wants to obtain a credential (e.g. dgtest)
  2. The endpoint of the VOMS server webservice (this address is usually different to the URL at which the VOMS admin can be accessed with a browser)
  3. The port on which the VOMS server listens for requests
  4. The distinguished name (DN) of the VOMS Host. If you are unsure about this, you can usually find the information on the "Configuration" page in the VOMS admin server application.

An example configuration of all the necessary parameters for the gLite adaptor could look as follows:

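import org.gridlab.gat.GATContext;
import org.gridlab.gat.Preferences;
import org.gridlab.gat.URI;
import org.gridlab.gat.security.CertificateSecurityContext;

// Fragment: the statements below belong inside a method body.
GATContext context = new GATContext();
// Key file, certificate file and the pass phrase that protects the key
CertificateSecurityContext secContext =
        new CertificateSecurityContext(
                   new URI("/home/dgdt0000/.globus/userkey.pem"),
                   new URI("/home/dgdt0000/.globus/usercert.pem"),
                   "mysupersecretpwd");
// VOMS parameters: server endpoint, port, host DN and VO name
Preferences globalPrefs = new Preferences();
globalPrefs.put("vomsServerURL", "skurut19.cesnet.cz");
globalPrefs.put("vomsServerPort", "7001");
globalPrefs.put("vomsHostDN", "/DC=cz/DC=cesnet-ca/O=CESNET/CN=skurut19.cesnet.cz");
globalPrefs.put("VirtualOrganisation", "voce");
context.addPreferences(globalPrefs);
context.addSecurityContext(secContext);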



mpi

Prepare

Configuration is necessary on both the CEs (gLite) and WNs in order to support and advertise MPI correctly (see Site configuration for MPI for details). This is performed by the gLite YAIM module glite-yaim-mpi which should be run on both the CE and WNs.


Install

Install the following package:

  • glite-MPI_utils

Configure

  1. Add the following to the site-info.def of the CE and WNs. See YaimConfig for detailed information.
  2. Export the set of environment variables to avoid the message "INFO: No MPI flavours enabled".
  3. Execute the yaim command to configure the node.

WARNING: In /etc/hosts, the WN must be listed with its full hostname (for example wn.fzk.de). Otherwise hostname -f does not return the full hostname and yaim aborts the configuration.

After the yaim configuration has finished, edit /etc/hosts again and restore the previous short hostname of the WN. Otherwise the node is seen twice, as two different nodes wn and wn.fzk.de, when nodes are reserved for an MPI job.
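Whether hostname -f currently returns a fully qualified name can be tested before yaim is started. The helper below is a minimal sketch with a hypothetical name; it only looks for a dot inside the name and does not query DNS.

```shell
#!/bin/sh
# Succeed if NAME looks fully qualified: it contains a dot that is
# neither the first nor the last character.
is_fqdn() {
    case "$1" in
        ""|.*|*.) return 1 ;;
        *.*)      return 0 ;;
        *)        return 1 ;;
    esac
}

# Example check before running yaim on a WN:
#   is_fqdn "$(hostname -f)" || echo "hostname -f is not fully qualified; adjust /etc/hosts first"
```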


Initial test

  • You can try submitting a job to your site using the instructions found via the page job submission
  • You can do some basic tests by logging in on a WN as a pool user and running the following:

