data:Dcache/190/server
Prepare
- Operating system
- Scientific Linux version 4.5 32 bit
Optimizing the configuration:
Use minimal operating system installation without firewall. To verify installed packages use the command
-
rpm -qa | grep package_name
Install the following additional packages:
-
yum -y install wget yum rpm make gcc gcc-c++ tar sed zlib openssl
After the installation is complete, turn off any unnecessary services (like gpm, sendmail, cups, haldaemon, messagebus, pcmcia, anacron, atd) with the following command:
-
chkconfig <SERVICE> off
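The per-service command can be looped over the services named in the text. The following is a minimal sketch: the function name `disable_services` and the `DRY_RUN` switch are assumptions made for illustration, and by default the commands are only printed so that the list can be reviewed before anything is changed.

```shell
#!/bin/bash
# Services named in the text as usually unnecessary on a dCache server.
SERVICES="gpm sendmail cups haldaemon messagebus pcmcia anacron atd"

# disable_services: hypothetical helper, not part of dCache or the OS.
# With DRY_RUN=1 (the default here) it only prints the commands.
disable_services() {
    local svc
    for svc in $SERVICES; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "chkconfig $svc off"
        else
            chkconfig "$svc" off
        fi
    done
}

disable_services
```

Running the script with `DRY_RUN=0` in the environment would execute the commands instead of printing them.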
Configure the following settings for the server:
- Firewall configuration
The dCache frontend runs SE services (GRIS, GridFTP, SRM). Which ports dCache uses is not important in itself, but it is advisable that all sites supporting dCache use the same ports, so that compatibility is achieved easily. When you set up your dCache system (by means of dCacheConfigure.sh), you can configure the ports to be opened in site-info.def with the following variables:
- DCACHE_PORT_RANGE_PROTOCOLS_SERVER_GSIFTP
- Sets the port range for dCache acting as a GSIFTP server in "passive" mode. The default range is 50000 to 52000 ("50000,52000").
- DCACHE_PORT_RANGE_PROTOCOLS_CLIENT_GSIFTP
- Sets the port range for dCache acting as a GSIFTP client in "active" mode. The default range is 33115 to 33125 ("33115,33125").
- DCACHE_PORT_RANGE_PROTOCOLS_SERVER_MISC
- Sets the port range for dCache acting as a (GSI)DCAP and xrootd server in "passive" mode. The default range is 60000 to 62000 ("60000,62000").
The dCache developers suggest the following firewall configuration (table taken from the dCache book, chapter 22):
| Protocol | Port(s) | Direction | Nodes |
|---|---|---|---|
| dCap | 22125 | incoming | doorDomain (admin node) |
| | any | outgoing | pools |
| GSIdCap | 22128 | incoming | gsidcapDomain (where GSIDCAP=yes in node_config) |
| | any | outgoing | pools |
| GridFTP | 2811 | incoming | gridftpDomain (where GRIDFTP=yes in node_config) |
| | 20000-25000 | outgoing (active FTP) | pools |
| | 20000-25000 | incoming (passive FTP) | gridftpDomain |
| SRM v1 | 8443 | incoming | srmDomain |
| SRM v2 | 8444 | incoming | srmDomain |
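If a packet filter is in use, the incoming ports from the table can be turned into rules. The sketch below only prints `iptables` commands and does not apply them. The helper name `print_accept_rule` is hypothetical, and the choice of the INPUT chain and plain ACCEPT rules is an assumption about the local firewall layout, not something prescribed by dCache.

```shell
#!/bin/bash
# print_accept_rule PORT_OR_RANGE
# Hypothetical helper that prints (does not apply) an iptables rule
# accepting incoming TCP traffic on a port or port range.
# A range may be given as "low,high" (site-info.def style) or "low-high".
print_accept_rule() {
    local ports="${1//[,-]/:}"   # iptables expects low:high for ranges
    echo "iptables -A INPUT -p tcp --dport ${ports} -j ACCEPT"
}

# Incoming ports suggested by the dCache developers.
for p in 22125 22128 2811 20000-25000 8443 8444; do
    print_accept_rule "$p"
done
```

Printing the rules first makes it possible to review them and paste only those that match the doors actually running on the node.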
To change these settings after setting up dCache, edit /opt/d-cache/config/dCacheSetup and modify the following values (if you want to adopt the settings proposed by the developers):
- in the "Java Configuration" section, set the parameters
  - Dorg.globus.tcp.port.range to "20000,25000"
  - Dorg.dcache.net.tcp.portrange to "33115,33215"
- in the "Network Configuration" section, set
  - dCapPort to "22125"
  - dCapGsiPort to "22128"
  - gsiFtpPortNumber to "2811"
  - srmPort to "8443"
  - clientDataPortRange to "20000,25000"
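Editing single values in dCacheSetup can also be scripted. The following sketch assumes that the entries in the "Network Configuration" section are plain `key=value` lines; the helper `set_setup_value` is hypothetical and keeps a backup copy before changing anything. Values containing `|` or `&` are not handled.

```shell
#!/bin/bash
# set_setup_value FILE KEY VALUE
# Hypothetical helper: replaces an existing "KEY=..." line with
# "KEY=VALUE", or appends the line if the key is missing.
# The previous content is kept as FILE.bak.
set_setup_value() {
    local file="$1" key="$2" value="$3"
    cp -p "$file" "$file.bak"
    if grep -q "^${key}=" "$file.bak"; then
        sed "s|^${key}=.*|${key}=${value}|" "$file.bak" > "$file"
    else
        echo "${key}=${value}" >> "$file"
    fi
}

# Example (not executed here):
# set_setup_value /opt/d-cache/config/dCacheSetup srmPort 8443
```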
administrator's script: prepare.sh
#!/bin/bash
# prepare to install dcache server
# Declare the variables section ------------
# Please insert your actual configuration
# The two users edguser and edginfo must be added on information provider nodes.
# They are not needed on other nodes but, since their presence will do no harm, they may be
# added on all nodes.
# EDGUSER=edguser
# EDGINFO=edginfo
# from here ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
EDGUSER=edguser
EDGINFO=edginfo
# till here ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#-> start routine
cd /etc/yum.repos.d/
wget http://dgiref.d-grid.de/svn/dgiref/PROD/cf2/repl/root/etc/yum.repos/dcache.repo
# The following repositories are only needed when you want to offer your resources in an LCG environment.
wget http://dgiref.d-grid.de/svn/dgiref/PROD/cf2/repl/root/etc/yum.repos/bdii.repo
wget http://dgiref.d-grid.de/svn/dgiref/PROD/cf2/repl/root/etc/yum.repos/certification-ca.repo
wget http://dgiref.d-grid.de/svn/dgiref/PROD/cf2/repl/root/etc/yum.repos/lcg-CA.repo
adduser -r -s /bin/false -d /home/edg $EDGUSER
adduser -r -s /bin/false -d /home/edg $EDGINFO
# clean all for yum to remove unnecessary configurations
yum clean all
# Optionally an update can do some good (the pattern is quoted so the shell does not expand it):
yum update --exclude '*kernel*'
#<- end routine
Install
WARNING: Two users edguser and edginfo must be added on information provider nodes (in general this is considered to be the "admin node"). They are not needed on other nodes but, since their presence will do no harm, they may be added on all nodes.
- Getting the dCache sources
- Finally, to install dCache, install the following bundles (a single yum install call should accept all of them as arguments, but it is more reliable to install the bundles separately).
administrator's script: install.sh
#!/bin/bash
# install dcache script
# load parameters from prepare section
cd `dirname $0`
source prepare.sh
# On all nodes except poolnodes:
yum install glite-version desy-SE_dcache_admin_postgres
# Again if you want to operate in LCG environment you also need lcg-CA and lcg-vomscerts.
yum install lcg-CA lcg-vomscerts
# On the node(s) that will have the (BDII) information system desy-SE_dcache_info.
yum install desy-SE_dcache_info
# On poolnodes desy-SE_dcache_pool
yum install desy-SE_dcache_pool
# So if you want to install everything on a single host,
# you may order all into one yum install command
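Since not every bundle belongs on every node, the choice can be made explicit per node role. The sketch below is an illustration only: the helper `bundles_for_role` and the role names (`admin` for non-pool nodes, `lcg`, `info`, `pool`) are assumptions, while the bundle names are the ones from install.sh. It prints one yum command per bundle, following the advice to install bundles separately.

```shell
#!/bin/bash
# bundles_for_role ROLE
# Hypothetical helper mapping a node role to the yum bundles named in
# install.sh. The role names themselves are assumptions.
bundles_for_role() {
    case "$1" in
        admin) echo "glite-version desy-SE_dcache_admin_postgres" ;;  # all non-pool nodes
        lcg)   echo "lcg-CA lcg-vomscerts" ;;                         # LCG environment only
        info)  echo "desy-SE_dcache_info" ;;                          # BDII information system
        pool)  echo "desy-SE_dcache_pool" ;;                          # pool nodes
        *)     echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Print one yum command per bundle; remove the leading "echo" to install.
for role in admin info pool; do
    for bundle in $(bundles_for_role "$role"); do
        echo yum -y install "$bundle"
    done
done
```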
Configure
- Get dCache working
- When all packages are installed, some configuration has to be done. For this purpose dCache ships a configuration script, dCacheConfigure.sh. This script reads two further files: site-info.def and users.conf. Both might be known to people who have already worked with gLite; in fact, they are the same files, extended with some dCache-specific statements. Templates are located at /opt/d-cache/share/doc/dCacheConfigure/examples, and you should make a copy of them to start with.
mkdir /root/nodeconfig
cp /opt/d-cache/share/doc/dCacheConfigure/examples/* /root/nodeconfig
- The users.conf
- Maybe your site already has a ready-made users.conf, just get it then. Otherwise the default one shipped with dCache is a good starting point. For documentation on the content please look at CERN TWiki.
- The site-info.def file
Same for the site-info.def, if your site already has one, use it.
- Locating Java
- As dCache uses Java, you will need to know the path to your installation of Java Development Kit (JDK). Please note this is not the path to the runtime environment (JRE), which is often a subdirectory within the JDK.
[root@dcache-node] rpm -qa|grep jdk
jdk-1.5.0_16-fcs
[root@dcache-node] rpm -ql jdk-1.5.0_16-fcs | grep bin/java
/usr/java/jdk1.5.0_16/bin/java
/usr/java/jdk1.5.0_16/jre/bin/java
In the above example, the JDK path is /usr/java/jdk1.5.0_16.
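The step from a listed `java` binary to the JDK path can be written down as a small string operation. This is a sketch under the assumption that the layout matches the example (`<JDK>/bin/java` and `<JDK>/jre/bin/java`); the helper name `jdk_home_from_binary` is hypothetical.

```shell
#!/bin/bash
# jdk_home_from_binary PATH_TO_JAVA
# Hypothetical helper: strips the trailing /bin/java (and a /jre level, if
# present) from a java binary path to obtain the JDK directory.
jdk_home_from_binary() {
    local path="${1%/bin/java}"
    echo "${path%/jre}"
}

jdk_home_from_binary /usr/java/jdk1.5.0_16/bin/java
jdk_home_from_binary /usr/java/jdk1.5.0_16/jre/bin/java
# both calls print /usr/java/jdk1.5.0_16
```

The result is the value to use for JAVA_LOCATION in site-info.def.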
You must update your /root/nodeconfig/site-info.def file, altering the values of the following variables:
- MY_DOMAIN
- JAVA_LOCATION
- USERS_CONF
- DCACHE_ADMIN
- DCACHE_POOLS
- DCACHE_DOOR_SRM
- DCACHE_DOOR_GSIFTP
- DCACHE_DOOR_XROOTD
- DCACHE_DOOR_LDAP
- VOS
WARNING: Please be aware that the DCACHE_DOOR_XROOTD variable is declared twice in the site-info.def file. This will be fixed with the next release; in the meantime, simply delete the extra instance.
When running dCacheConfigure.sh, you must explicitly allow the script to reset various configuration options. This means you must either uncomment the existing lines or add extra lines.
RESET_DCACHE_CONFIGURATION=yes
RESET_DCACHE_PNFS=yes
RESET_DCACHE_RDBMS=yes
With these lines in place, you must run dCacheConfigure.sh to configure dCache. Be patient, it may take several minutes to run.
WARNING:
After running dCacheConfigure.sh you should remove or comment out the three lines that start with RESET_. Failure to do this will reset vital data when you next run dCacheConfigure.sh, almost certainly resulting in data loss.
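Commenting out the RESET_ lines is easy to forget, so it may be worth scripting. The following is a minimal sketch; the helper `disable_reset_flags` is hypothetical, and it assumes the lines start at the beginning of the line exactly as shown in the text. A backup copy of the file is kept.

```shell
#!/bin/bash
# disable_reset_flags FILE
# Hypothetical helper: comments out every line that starts with RESET_,
# keeping the previous content as FILE.bak.
disable_reset_flags() {
    local file="$1"
    cp -p "$file" "$file.bak"
    sed 's/^RESET_/#RESET_/' "$file.bak" > "$file"
}

# Example (not executed here), to be run after dCacheConfigure.sh has finished:
# disable_reset_flags /root/nodeconfig/site-info.def
```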
Suitable values are given below.
administrator's script: configure.sh
#!/bin/bash
# configure dcache server
# Declare the variables section ------------
# Please insert your actual configuration
# VO_NAMES="whitespace-separated list of your supported VOs"
# SITE_INFO_DEF=path to the site-info.def
# SITE_INFO_DEF_TMP=temporary site-info.def path
# MY_DOMAIN="domain name"
# JAVA_PATH="path to jdk instance"
# USERS_CONF="path to the users.conf"
# DCACHE_ADMIN="admin node.\$MY_DOMAIN"
# DCACHE_POOLS="pool node.\$MY_DOMAIN:7:/pools/1"
# DCACHE_DOOR_SRM="\"door srm node.\$MY_DOMAIN\""
# DCACHE_DOOR_GSIFTP="\"door gsiftp node.\$MY_DOMAIN\""
# DCACHE_DOOR_GSIDCAP="\"door gsidcap node.\$MY_DOMAIN\""
# DCACHE_DOOR_DCAP="\"door dcap node.\$MY_DOMAIN\""
# from here ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
VO_NAMES="dgtest "
SITE_INFO_DEF=/root/nodeconfig/site-info.def
SITE_INFO_DEF_TMP=/tmp/site-info.def
MY_DOMAIN="\$(hostname -d)"
JAVA_PATH=/usr/java/jdk1.5.0_16
USERS_CONF=/root/nodeconfig/users.conf
DCACHE_ADMIN="dgiref-dcache.\$MY_DOMAIN"
DCACHE_POOLS="dgiref-dcache.\$MY_DOMAIN:7:/pools/1 \\
dgiref-dcache.\$MY_DOMAIN:7:/pools/2"
DCACHE_DOOR_SRM="\"dgiref-dcache.\$MY_DOMAIN\""
DCACHE_DOOR_GSIFTP="\"dgiref-dcache.\$MY_DOMAIN\""
DCACHE_DOOR_GSIDCAP="\"dgiref-dcache.\$MY_DOMAIN\""
DCACHE_DOOR_DCAP="\"dgiref-dcache.\$MY_DOMAIN\""
# till here ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# check if the value $1 exists and fill it with $2
function checkAndFill {
if grep -q $1 $SITE_INFO_DEF_TMP;
then
echo "will be updated: ${1} with ${2} to $SITE_INFO_DEF"
sed -i "/$1/ c\
$1$2" $SITE_INFO_DEF_TMP
else
echo "will be inserted: ${1} with ${2} to $SITE_INFO_DEF"
echo $1$2 >> $SITE_INFO_DEF_TMP
fi
}
cp -f $SITE_INFO_DEF $SITE_INFO_DEF_TMP
# fill in the variables in /root/nodeconfig/site-info.def
checkAndFill MY_DOMAIN= "${MY_DOMAIN}"
checkAndFill USERS_CONF= "${USERS_CONF}"
checkAndFill JAVA_LOCATION= "${JAVA_PATH}"
# Remember to replace dcache-node.fzk.de with the FQDN of this host
checkAndFill DCACHE_ADMIN= "${DCACHE_ADMIN}"
# Here on the poolnode two pools "1" and "2" are created at /pools with a size of 7 GB
checkAndFill DCACHE_POOLS= "${DCACHE_POOLS}"
# it is advised to set the SRM service on a separate node
#checkAndFill DCACHE_DOOR_SRM= "${DCACHE_DOOR_SRM}"
# it is also advised for bigger sites to set GridFTP services on separate nodes
checkAndFill DCACHE_DOOR_GSIFTP= "${DCACHE_DOOR_GSIFTP}"
checkAndFill DCACHE_DOOR_GSIDCAP= "${DCACHE_DOOR_GSIDCAP}"
# should be only (!) on the admin node
checkAndFill DCACHE_DOOR_DCAP= "${DCACHE_DOOR_DCAP}"
# if you don't want to offer a door to a protocol then don't set such a value...
#DCACHE_DOOR_XROOTD="dcache-doornode.\$MY_DOMAIN"
#DCACHE_DOOR_LDAP="dcache-doornode.\$MY_DOMAIN"
# configurations for the firewall ports
checkAndFill DCACHE_PORT_RANGE_PROTOCOLS_SERVER_GSIFTP= "20000,25000"
checkAndFill DCACHE_PORT_RANGE_PROTOCOLS_CLIENT_GSIFTP= "33115,33215"
checkAndFill DCACHE_PORT_RANGE_PROTOCOLS_SERVER_MISC= "20000,25000"
checkAndFill VOS= "${VO_NAMES}"
mv -f $SITE_INFO_DEF_TMP $SITE_INFO_DEF
# run dCacheConfigure.sh
/opt/d-cache/bin/dCacheConfigure.sh -c config_sedcache -s /root/nodeconfig/site-info.def
Proceed
- Starting and stopping dCache
- dCache contains a convenient script (also) called dcache for high-level administration of the running services. It is located in the directory /opt/d-cache/bin and accepts one of three arguments, each followed by zero or more secondary parameters.
- /opt/d-cache/bin/dcache start [cellname] [...]
- This command starts all dCache domains configured for the individual node, or only a selection if cell names are given. On its own, the script will not start any services that are provided on other hosts. Nevertheless, such services can be started manually, which, depending on the service, can seriously disrupt the system.
- Examples
- Consider the minimal recommended setup of dCache with three hosts: one admin node, one SRM host and one PNFS host. When dcache is called without restricting the cell names, the following may happen.
administrator's script: proceed.sh
#!/bin/bash
# start/stop dcache server
# on the admin node
/opt/d-cache/bin/dcache start
# on the SRM-node
/opt/d-cache/bin/dcache start
# on the PNFS-node
/opt/d-cache/bin/dcache start
# Now if an admin wants to break his dCache system, he can start for example the pnfs cell on the admin node.
/opt/d-cache/bin/dcache start pnfs
# Because the cell pnfs is known to dCache, it does not prevent its initialization.
# But two different pnfs services will definitely disrupt the overall running services! (So don't try this in production)
# A good way of preventing such accidents is to issue a status report for the node...
# This command just lists all running dCache cells and all stopped cells that should be running.
# But it leaves out all cells that can be run on this node.
/opt/d-cache/bin/dcache status
# Precisely this situation can occur when dCache is installed and configured for the first time.
# Sometimes the adminDomain is not started properly...
# Usage of the stop argument is exactly analogous to that of start.
# It relies on lastPID files that are created on startup of a cell and may be helpful to restart stopped cells
# that refuse to get initialized (for some reason).
/opt/d-cache/bin/dcache stop utility
Initial test
- Testing installed dCache system
- dCache web interface
- If everything is configured correctly and running, you can open the web interface of your dCache instance in a browser at the generic address http://dcache-headnode.yourDomain:2288. For the dCache reference installation the address is http://dgiref-dcache.fzk.de:2288 (D-Grid dCache reference installation web interface).
- accessing file system with standard commands
everything alright:
[root@dgiref-dcache ~]# ls /pnfs/fzk.de/data/dgtest/
[root@dgiref-dcache ~]# touch /pnfs/fzk.de/data/dgtest/test.blub
[root@dgiref-dcache ~]# ls /pnfs/fzk.de/data/dgtest/
test.blub
[root@dgiref-dcache ~]# rm /pnfs/fzk.de/data/dgtest/test.blub
rm: remove regular empty file `/pnfs/fzk.de/data/dgtest/test.blub'? y
[root@dgiref-dcache ~]# ls /pnfs/fzk.de/data/dgtest/
doesn't work, but that is normal:
[root@dgiref-dcache ~]# cp /bin/bash /pnfs/fzk.de/data/dgtest/test.blub
cp: closing `/pnfs/fzk.de/data/dgtest/test.blub': Input/output error
- copying data using dCache protocols
Use a UI-server and voms-proxies! (dgiref-login.fzk.de)
without a voms-proxy you get this error:
[user@dgiref-login]$ dccp /bin/bash gsidcap://dgiref-dcache.fzk.de:22128/pnfs/fzk.de/data/dgtest/testbin-1
Error ( POLLIN POLLERR POLLHUP) (with data) on control line [3]
Failed to create a control line
Error ( POLLIN POLLERR POLLHUP) (with data) on control line [5]
Failed to create a control line
Failed open file in the dCache.
Can't open destination file : Server rejected "hello"
System error: Input/output error
[user@dgiref-login]$ voms-proxy-info
Couldn't find a valid proxy.
[user@dgiref-login]$ voms-proxy-init -voms dgtest
Cannot find file or dir: /home/site/user/.glite/vomses
Enter GRID pass phrase:
Your identity: /C=DE/O=GermanGrid/OU=FZK/CN=user
Creating temporary proxy ................................................... Done
Contacting dgrid-voms.fzk.de:15000 [/O=GermanGrid/OU=FZK/CN=host/dgrid-voms.fzk.de] "dgtest" Done
Creating proxy ...................................... Done
Your proxy is valid until Mon Nov 10 23:26:03 2008
[user@dgiref-login]$ voms-proxy-info
subject : /C=DE/O=GermanGrid/OU=FZK/CN=user/CN=proxy
issuer : /C=DE/O=GermanGrid/OU=FZK/CN=user
identity : /C=DE/O=GermanGrid/OU=FZK/CN=user
type : proxy
strength : 512 bits
path : /tmp/x509up_u10824
timeleft : 11:59:58
Now you can copy data into dCache (using GSIdcap):
first time issued, you get an error message, but the file exists in dCache (see below):
[user@dgiref-login]$ dccp /bin/bash gsidcap://dgiref-dcache.fzk.de:22128/pnfs/fzk.de/data/dgtest/testbin-1
Command failed!
Server error message for [1]: "no such file or directory /pnfs/fzk.de/data/dgtest/testbin-1" (errno 10001).
585908 bytes in 0 seconds
files in dCache cannot be overwritten or edited (that is intended):
[user@dgiref-login]$ dccp /bin/bash gsidcap://dgiref-dcache.fzk.de:22128/pnfs/fzk.de/data/dgtest/testbin-1
Command failed!
Server error message for [2]: "File is readOnly" (errno 1).
Failed open file in the dCache.
Can't open destination file : "File is readOnly"
System error: Input/output error
you may only write to directories linked to your VO (this is also intended; in the future, dCache admins will have detailed control over user privileges):
[user@dgiref-login]$ dccp /bin/bash gsidcap://dgiref-dcache.fzk.de:22128/pnfs/fzk.de/data/textgrid/testbin-1
Command failed!
Server error message for [1]: "no such file or directory /pnfs/fzk.de/data/textgrid/testbin-1" (errno 10001).
Command failed!
Server error message for [2]: "Permission denied (Parent)" (errno 2).
Failed open file in the dCache.
Can't open destination file : "Permission denied (Parent)"
System error: Input/output error
To verify the transfer, copy the file back and compare the checksums:
[user@dgiref-login]$ dccp gsidcap://dgiref-dcache.fzk.de:22128/pnfs/fzk.de/data/dgtest/testbin-1 /tmp/testfilefromdCache
585908 bytes in 0 seconds
[user@dgiref-login]$ md5sum /bin/bash /tmp/testfilefromdCache
dc4e36cfdf491029a67f4e317cab3151 /bin/bash
dc4e36cfdf491029a67f4e317cab3151 /tmp/testfilefromdCache
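The comparison of the two checksums can be automated instead of being done by eye. The following is a small sketch; the helper name `same_md5` is hypothetical and it relies only on the `md5sum` tool already used in the session.

```shell
#!/bin/bash
# same_md5 FILE_A FILE_B
# Hypothetical helper: succeeds (exit status 0) only if both files can be
# read and have the same MD5 checksum.
same_md5() {
    local a b
    a="$(md5sum < "$1" | cut -d' ' -f1)"
    b="$(md5sum < "$2" | cut -d' ' -f1)"
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example (not executed here):
# same_md5 /bin/bash /tmp/testfilefromdCache && echo "checksums match"
```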
Same procedure with srmcp tool (using GridFTP protocol) (also known as "srm put"):
Note: Usage of srmcp instead of dccp! Don't worry about the error message returned.
[user@dgiref-login]$ srmcp file:////bin/bash gridftp://dgiref-dcache.fzk.de:2811/pnfs/fzk.de/data/dgtest/testbin-2
WARNING: SRM_PATH is defined, which might cause a wrong version of srm client to be executed
WARNING: SRM_PATH=/opt/d-cache/srm
And again, fetching the file from dCache ("srm get"). Unfortunately, dCache in the D-Grid reference installation only supports srm put in stream mode, that is, transferring data over a single stream (normally up to ten are used). The reason for this is unknown and still has to be investigated.
Note: Usage of srmcp instead of dccp! Don't worry about the error message returned.
[user@dgiref-login]$ srmcp -debug=true -streams_num=1 \
> gridftp://dgiref-dcache.fzk.de:2811/pnfs/fzk.de/data/dgtest/testbin-2 file:////dev/null
WARNING: SRM_PATH is defined, which might cause a wrong version of srm client to be executed
WARNING: SRM_PATH=/opt/d-cache/srm
- Deleting files in dCache
Easiest is to delete the namespace entry:
[root@dgiref-dcache ~]# rm /pnfs/fzk.de/data/dgtest/testbin-1
rm: remove regular empty file `/pnfs/fzk.de/data/dgtest/testbin-1'? y
The file does not disappear immediately but shortly afterwards, so that dCache can prioritize read operations over deletions.
Regular users can delete files from the UI with srmrm (equals srm -rm):
[user@dgiref-login]$ srmrm srm://dgiref-dcache.fzk.de:8443/pnfs/fzk.de/data/dgtest/testbin-1
WARNING: SRM_PATH is defined, which might cause a wrong version of srm client to be executed
WARNING: SRM_PATH=/opt/d-cache/srm
- GUI pcells for dCache
The core mechanism for administrating a dCache instance is the admin interface. This is a service you can connect to using an ssh client. Using the admin interface, you can communicate with the various components making up dCache, query their status and update their behaviour. Although dCache is a distributed system, you only ever connect to a single node; dCache will route your messages internally. The source for pcells, a graphical user interface which greatly simplifies working with the admin interface, as well as an installation guide, can be found at dCache.org. Once pcells is installed, these are the steps to take in order to connect to your dCache system:
- start pcells and open new session
- adjust settings by clicking on Setup
- addresses
- hostname = dCache_headnode.yourDomain
- addresses
- login as admin with (default) passphrase dickerelch
Of course, no one is forced to use this GUI; the admin interface can still be accessed with a plain ssh client:
ssh -1 -c blowfish -p 22223 -l admin dCache_headnode.yourDomain
Guidance on how to use the admin interface is out of the scope of this documentation. Please look into the wiki at dCache.org.
administrator's script: test.sh
#!/bin/bash
# test
Update
To update the dCache server state, use:
administrator's script: update.sh
#!/bin/bash
# update dcache server
# Remove
yum remove glite-version desy-SE_dcache_admin_postgres lcg-CA lcg-vomscerts desy-SE_dcache_info desy-SE_dcache_pool