middleware:Glite/cream
Introduction
The gLite middleware is a complex system of interconnected parts that interact over the network. It includes both the middleware to store data (dCache Storage Element (SE)) and the cluster resources (Worker Nodes, Local Resource Management System, NFS server). Every gLite instance has a Computing Element as a frontend for job submission. All connections need to pass through a generic interface to the cluster (Grid Gate). The Information Service (IS), or "site BDII", provides information about the Grid resources and their status, which can be used for monitoring and accounting. The current D-Grid gLite implementation uses the Globus Monitoring & Discovery Service (MDS) for resource discovery and to publish the resource status.
Note from ticket 614: WARNING: User files in the home folders of user accounts may be deleted by a gLite cleanup cronjob if the site uses gLite together with some other middleware (Globus, UNICORE or both). The problematic cronjob is /etc/cron.d/cleanup-grid-accounts. D-Grid uses shared accounts with fixed user mappings instead of pool accounts, hence this cronjob should be disabled. Files related to this cronjob: /var/log/cleanup-grid-accounts.log* (log files) and /opt/lcg/etc/cleanup-grid-accounts.conf. The recommended solution is to completely disable that cronjob on the gLite CE. Please be aware that:
GLite CREAM CE
Prepare
- Operating system
- Scientific Linux v.5.6 64 bit
| This is the grid middleware package installation procedure. Please prepare the cluster. |
| This middleware will be installed from the UMD repo |
Optimizing the configuration:
Use a minimal operating system installation without a firewall. To verify which packages are installed, use the command:
- rpm -qa | grep package_name
Install the following additional packages:
- yum -y install wget yum rpm make gcc gcc-c++ tar sed zlib openssl
After the installation is complete, turn off any unnecessary services (like gpm, sendmail, cups, haldaemon, messagebus, pcmcia, anacron, atd) with the following command:
- chkconfig <SERVICE> off
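Since the same command has to be repeated for every service, a small loop can save typing. The following is only a sketch: the helper name disable_services is hypothetical, and the command to run is passed as the first argument so that the loop can be tried with echo before it is used with chkconfig.

```shell
# Hypothetical helper: run "<command> <service> off" for every listed service.
# Pass "echo" as the command for a dry run, "chkconfig" for the real change.
disable_services() {
    runner="$1"
    shift
    for svc in "$@"; do
        $runner "$svc" off
    done
}

# Dry run first, then the real call (service names taken from the list above):
#   disable_services echo gpm sendmail cups haldaemon messagebus pcmcia anacron atd
#   disable_services chkconfig gpm sendmail cups haldaemon messagebus pcmcia anacron atd
```

Services that are not installed on a minimal system will make chkconfig report an error for that name; the loop simply continues with the next one.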
Configure the following settings for the server:
The supported installation method for SL5 is the yum tool. You have to configure the yum repositories yourself and install the meta-packages in your preferred way.
| Please note that YAIM DOES NOT SUPPORT INSTALLATION |
- Additional Software
- UMD-repo
- Torque Client
administrator's script: prepare.sh
#!/bin/bash
# prepare gLite CREAM CE to install
#----------------------------------------------------- configure external repositories
# remove the EPEL/UMD repo data if available
rm /etc/yum.repos.d/UMD* /etc/yum.repos.d/epel*
# install epel repo
rpm -Uhv http://ftp-stud.hs-esslingen.de/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm
# alternative: rpm -Uhv "http://mirror.scc.kit.edu/index.php?dir=dgiref-check/other/&file=epel-release-5-4.noarch.rpm"
# install yum-priorities
yum install yum-priorities
# install umd repo configuration
rpm -Uhv http://repository.egi.eu/sw/production/umd/1/sl5/x86_64/updates/umd-release-1.0.2-1.el5.noarch.rpm
yum clean all
yum install xml-commons-apis
# torque > 2.5.7 should be used
# Create a user edguser
groupadd edguser
useradd -g edguser -d /localhome/edguser -s /bin/bash edguser
Install
Install CREAM and BDII:
- Install the meta-packages emi-bdii-site and emi-cream-ce from the UMD repo
- Install the CA certificates from the UMD repo
administrator's script: install.sh
#!/bin/bash
# install gLite CREAM CE & Site-BDII
yum -y install emi-torque-client.x86_64 emi-torque-utils.x86_64
yum -y install emi-bdii-site emi-cream-ce
yum -y install ca-policy-egi-core
Configure
CREAM CE has a YAIM-based configuration system. This means that, after installing the CREAM RPMs, you need to adapt several configuration files on the target system and then run the configuration process. For a complete installation, the following files in the /opt/glite/yaim/examples/siteinfo/ folder need to be customized:
site-info.def
example of site-info.def for Reference Installation
The most important configuration file of the CREAM CE is site-info.def. Here you describe your CREAM CE configuration and services (such as the location of the dCache server, the accepted VOs, the site BDII configuration, the batch configuration directory, etc.). System administrators are free to choose the configuration structure they prefer. It is possible to keep all configuration variables in one big site-info.def.
| Please take note of the Mandatory site-info.def CREAM-CE variables |
users.conf and groups.conf
example of users.conf for Reference Installation
example of groups.conf for Reference Installation
| You can use the script to generate the users.conf file from a list of VOs |
The file users.conf must be created or adapted for the users of all VOs. During configuration, the YAIM configuration tool creates these users if they do not exist yet. If the user accounts already exist, YAIM does not change their UIDs/GIDs. The entries are controlled in the directory /etc/grid-security/gridmapdir.
groups.conf is used to determine the groups permitted on the CREAM CE.
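As an illustration of what a users.conf generator produces, the sketch below prints single entries. It assumes the usual YAIM line layout UID:LOGIN:GID:GROUP:VO:FLAG: (with a trailing colon); please compare it with the example users.conf of the Reference Installation before relying on it. The function name, the UID/GID numbers and the login name are hypothetical; only the VO name dgops is taken from the test script on this page.

```shell
# Hypothetical helper: print one users.conf entry.
# Assumed layout (verify against the example users.conf):
#   UID:LOGIN:GID:GROUP:VO:FLAG:
users_conf_line() {
    uid="$1"
    login="$2"
    gid="$3"
    group="$4"
    vo="$5"
    flag="${6:-}"    # optional flag, empty for ordinary users
    printf '%s:%s:%s:%s:%s:%s:\n' "$uid" "$login" "$gid" "$group" "$vo" "$flag"
}

# Example with made-up numeric IDs:
#   users_conf_line 20001 dgops001 2001 dgops dgops >> users.conf
```

Calling the function once per account and appending the output to a file yields a users.conf that YAIM can read, provided the assumed layout matches your YAIM version.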
wn-list.conf
example of wn-list.conf for Reference Installation
This file should contain the list of domain names of the worker nodes in the cluster.
| You can use the output of the qnodes -a command to create wn-list.conf |
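The extraction of node names from the qnodes output can be sketched as follows. The sketch assumes the usual Torque output layout, in which each node name starts in the first column and its attributes follow on indented lines; the function name extract_wn_names is hypothetical.

```shell
# Hypothetical helper: read "qnodes -a" output on stdin and print one
# node name per line. Assumes node names start in the first column and
# attribute lines are indented.
extract_wn_names() {
    awk '/^[^[:space:]]/ { print $1 }'
}

# Example usage on the Torque server or a configured client:
#   qnodes -a | extract_wn_names > /opt/glite/yaim/etc/wn-list.conf
```

Review the resulting file before running YAIM, since nodes that are down or offline are listed by qnodes -a as well.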
BDII
"The Berkeley Database Information Index (BDII) consists of a standard LDAP database which is updated by an external process." [1].
BDII provides information about the Grid system, the related VOs, the storage system, the path to the VO directory in the storage system, the batch system, and more. To configure a site BDII together with the CREAM CE, several options must be specified in site-info.def.
Mandatory BDII specific variables table
| Variable Name | Description | Value type |
|---|---|---|
| BDII_REGIONS | List of host identifiers publishing information to the BDII. For each item listed in the BDII_REGIONS variable you need to create a BDII_<host-id>_URL variable | node-type name |
| BDII_<host-id>_URL | URL of the information producer, e.g. BDII_host1_URL="ldap://host1_hostname:2170/mds-vo-name=resource,o=grid", where host1 is a host on which several node types may be installed (for example an lcg CE and a site BDII). It is therefore not necessary to create one variable per node type, only one per host | URL(*) |
| SITE_DESC | Long format Name of your site | "A long format name of your site" |
| SITE_LOC | Location of the site BDII | "City, Country" |
| SITE_OTHER_GRID | Grid project(s) the site belongs to. Use the pipe character to separate values. | Grid project name |
| SITE_OTHER_* | For more details, please visit https://wiki.egi.eu/wiki/MAN1_How_to_publish_Site_Information | SITE_OTHER_GRID="WLCG or EGI" |
| SITE_SECURITY_EMAIL | Contact email for security | e-mail address |
| SITE_SUPPORT_EMAIL | The site user support e-mail address as published by the information system | e-mail address |
| SITE_SUPPORT_SITE | Support entry point. Unique Id for the site in the GOC DB and information system | my-bigger-site.their-domain |
| SITE_TIER | Site tier | TIER 1 or TIER 2 |
| SITE_WEB | Site web site | URL |
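Because site-info.def uses shell variable syntax, the variables from the table can be illustrated with a short fragment. All values below (host names, e-mail addresses, site description) are hypothetical placeholders. The loop at the end is not part of site-info.def; it is only a sketch of how to check the rule that every entry in BDII_REGIONS needs a matching BDII_<host-id>_URL variable.

```shell
# Hypothetical example values; replace them with your site's real data.
BDII_REGIONS="host1"
BDII_host1_URL="ldap://host1.example.org:2170/mds-vo-name=resource,o=grid"
SITE_DESC="Example Site of the Example Institute"
SITE_LOC="City, Country"
SITE_SECURITY_EMAIL="security@example.org"
SITE_SUPPORT_EMAIL="support@example.org"
SITE_TIER="TIER 2"
SITE_WEB="http://www.example.org"

# Sanity check (not part of site-info.def): every region needs a URL variable.
missing=""
for region in $BDII_REGIONS; do
    eval "url=\${BDII_${region}_URL:-}"
    if [ -z "$url" ]; then
        missing="$missing $region"
    fi
done

if [ -n "$missing" ]; then
    echo "Missing BDII_<host-id>_URL for:$missing"
else
    echo "All BDII regions have a URL"
fi
```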
Publishing dCache information in site BDII (optional)
For this step, dcache-server must be installed on the BDII host with an info-provider-only configuration, where the info-provider server is the real dCache server.
Run the following bash commands:
# define dCache info-provider server
dcache_infoprovider_node="dgiref-dcache.fzk.de"
# install dcache-server
yum install dcache-server
# configure for info-provider
echo "
dcache.layout=bdii_layout
info-provider.http.host=$dcache_infoprovider_node
" >/etc/dcache/dcache.conf
# bdii_layout.conf should be empty
echo -n >/etc/dcache/layouts/bdii_layout.conf
# copy info-provider.xml from dcache info-provider server
scp root@$dcache_infoprovider_node:/etc/dcache/info-provider.xml /etc/dcache/info-provider.xml
# create a copy or symlink of the dcache-info-provider requester; /var/lib/bdii/gip/provider/dcache-info-provider should be executable
cp /usr/sbin/dcache-info-provider /var/lib/bdii/gip/provider/
# to test, just run the binary /var/lib/bdii/gip/provider/dcache-info-provider and you will see what kind of information you get from dCache
Start yaim configuration
After all these steps, the YAIM configuration process can be started:
- Execute yaim for BDII
- Execute yaim for creamCE & TORQUE utils services
administrator's script: configure.sh
# hostcerts
# ll /etc/grid-security/
# -r--r--r-- 1 root root 2092 Sep 13 15:42 hostcert.pem
# -r-------- 1 root root 1679 Sep 13 15:42 hostkey.pem
# site-info.def customization
cp /opt/glite/yaim/examples/siteinfo/site-info.def /opt/glite/yaim/site-info.def
cp /opt/glite/yaim/examples/groups.conf /opt/glite/yaim/etc/groups.conf
cp /opt/glite/yaim/examples/users.conf /opt/glite/yaim/etc/users.conf
cp /opt/glite/yaim/examples/wn-list.conf /opt/glite/yaim/etc/wn-list.conf
# Since the site-info.def file contains passwords, it should NOT be readable for users!
chmod 600 /opt/glite/yaim/site-info.def
chmod 700 /opt/glite/yaim
chmod 777 /opt/glite/var/
# configure site BDII
/opt/glite/yaim/bin/yaim -c -s /opt/glite/yaim/site-info.def -n glite-BDII_site
# For Torque (if the CREAM CE is NOT Torque master):
/opt/glite/yaim/bin/yaim -c -s /opt/glite/yaim/site-info.def -n creamCE -n TORQUE_utils
# if you use old version of log parser uncomment this
#/opt/glite/yaim/bin/yaim -r -s /opt/glite/yaim/site-info.def -n creamCE -f config_cream_blparser
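The hostcerts comment in configure.sh lists the host certificate as read-only for everyone (444) and the host key as readable by root only (400). A check like the one below can confirm this before YAIM is run. It is only a sketch: the helper name check_mode is hypothetical, and it relies on the GNU coreutils stat option -c being available on the system.

```shell
# Hypothetical helper: succeed only if FILE has exactly the octal mode MODE.
# Uses GNU stat ("stat -c %a").
check_mode() {
    file="$1"
    expected="$2"
    actual=$(stat -c '%a' "$file") || return 2
    [ "$actual" = "$expected" ]
}

# Example usage with the modes shown in the listing above:
#   check_mode /etc/grid-security/hostcert.pem 444 || echo "unexpected mode on hostcert.pem"
#   check_mode /etc/grid-security/hostkey.pem  400 || echo "unexpected mode on hostkey.pem"
```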
Proceed
Add CREAM and Tomcat to the startup configuration:
- chkconfig tomcat5 on
- chkconfig gLite on
Start the software immediately:
- service tomcat5 start
- service gLite start
administrator's script: proceed.sh
service tomcat5 restart
service gLite restart
Initial test
To quickly verify that the CREAM service is working, perform the following steps from the client side:
- Create a test .jdl script. For more information about .jdl scripts, see the CREAM JDL guide
- Initialize the grid user proxy with a VOMS signature
- To submit the job, use glite-ce-job-submit
- To get information about the job, use glite-ce-job-status
The test has passed when the status of the executed job is DONE-OK.
To quickly test your batch system, use echo "/bin/hostname" | qsub and qstat -a
administrator's script: test.sh
#!/bin/bash
cream_server="dgiref-glite.fzk.de"
pbs_queue_name="dgiseq"
# write a minimal test job description
echo '
Executable = "/bin/hostname";
stdOutput = "stdout";
stdError = "stderr";
' > test.jdl
# create a VOMS proxy and submit the job
voms-proxy-init --voms dgops
job_id=`glite-ce-job-submit -a -r ${cream_server}:8443/cream-pbs-${pbs_queue_name} test.jdl`
sleep 30;
glite-ce-job-status $job_id
# to test the BDII execute: ldapsearch -xLLL -b o=grid -p 2170 -h $bdii_host
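To check for DONE-OK automatically instead of reading the output by eye, the status can be extracted from the glite-ce-job-status output. The sketch below assumes that the command prints a line of the form "Status = [DONE-OK]"; if your client version formats the output differently, the pattern has to be adapted. The function name extract_job_status is hypothetical.

```shell
# Hypothetical helper: read glite-ce-job-status output on stdin and print
# the first job status found. Assumes a line such as "Status = [DONE-OK]".
extract_job_status() {
    sed -n 's/.*Status[[:space:]]*=[[:space:]]*\[\([^]]*\)].*/\1/p' | head -n 1
}

# Example usage inside test.sh:
#   status=$(glite-ce-job-status "$job_id" | extract_job_status)
#   [ "$status" = "DONE-OK" ] && echo "test passed" || echo "job status: $status"
```

A fixed sleep of 30 seconds may be too short on a busy cluster, so repeating the status query until a final state is reported is usually more reliable than a single check.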