data:Dcache/1912
Introduction
The goal of this project is to provide a system for storing and retrieving huge amounts of data, distributed among a large number of heterogeneous server nodes, under a single virtual filesystem tree with a variety of standard access methods. The dCache system consists of several components (nodes) which interact over the network.
dCache server v1.9.12
Prepare
- Operating system
- Scientific Linux v.5.6 64 bit
Optimizing the configuration:
Use a minimal operating system installation without a firewall. To verify that a package is installed, use the command:
-
rpm -qa | grep package_name
Install the following additional packages:
-
yum -y install wget yum rpm make gcc gcc-c++ tar sed zlib openssl
After the installation is complete, turn off any unnecessary services (like gpm, sendmail, cups, haldaemon, messagebus, pcmcia, anacron, atd) with the following command:
-
chkconfig <SERVICE> off
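The per-service command can be wrapped in a loop. The following is a minimal sketch, assuming the service list named in the text; the function name `disable_services` and the `DRY_RUN` switch are illustrative and not part of dCache or the OS. By default it only prints the commands so they can be reviewed first.

```shell
# Services named in the text above; adjust to what is actually installed.
UNNEEDED_SERVICES="gpm sendmail cups haldaemon messagebus pcmcia anacron atd"

# Hypothetical helper: prints the chkconfig commands by default and only
# executes them when DRY_RUN=0 is set (run as root in that case).
disable_services() {
    local svc
    for svc in $UNNEEDED_SERVICES; do
        if [ "${DRY_RUN:-1}" = "0" ]; then
            chkconfig "$svc" off
        else
            echo "chkconfig $svc off"
        fi
    done
}

disable_services
```

To apply the changes instead of printing them, call `DRY_RUN=0 disable_services` as root.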
Configure the following settings for the server:
- proxy
- ntp
- /etc/resolv.conf
- java SDK
- PostgreSQL
- UMD and EPEL repos
- The host running the srm transfer service needs to have a valid host certificate and a host key in place (/etc/grid-security/hostcert.pem, /etc/grid-security/hostkey.pem).
- Prepared grid-vorolemap, storage-authzdb [dCache ABA]
- Firewall configuration
For normal communication between the dCache server and its clients, the firewall must accept traffic on the dCache default ports listed below.
| Port number | Description | Direction / component |
|---|---|---|
| 32768 | is used by the NFS layer within dCache which is based upon rpc. This service is essential for rpc. | NFS |
| 1939, 33808 | is used by portmapper which is also involved in the rpc dependencies of dCache. | portmap |
| 34075 | is for postmaster listening to requests for the PostgreSQL database for dCache database functionality. | Outbound for SRM, PnfsDomain, dCacheDomain and doors; inbound for PostgreSQL server. |
| 33823 | is used for internal dCache communication. | By default: outbound for all components, inbound for dCache domain. |
| 8443 | is the SRM port. See Chapter 14, dCache Storage Resource Manager | Inbound for SRM |
| 2288 | is used by the web interface to dCache. | Inbound for httpdDomain |
| 22223 | is used for the dCache admin interface. See the section called “The Admin Interface” | Inbound for adminDomain |
| 22125 | is used for the dCache dCap protocol. | Inbound for dCap door |
| 22128 | is used for the dCache GSIdCap . | Inbound for GSIdCap doors |
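If a firewall is enabled after all, the client-facing inbound ports from the table can be opened with one rule per port. This is a sketch under two assumptions that the table does not state: that all listed ports are TCP, and that plain `iptables` is the firewall in use. The helper name `print_dcache_rules` is illustrative. It only prints the rules so they can be reviewed before being applied.

```shell
# Client-facing inbound ports taken from the table above
# (SRM, web interface, admin interface, dCap, GSIdCap).
# Assumption: all of them are TCP; the table does not state the protocol.
DCACHE_INBOUND_PORTS="8443 2288 22223 22125 22128"

# Hypothetical helper: prints one iptables rule per port for review.
print_dcache_rules() {
    local port
    for port in $DCACHE_INBOUND_PORTS; do
        echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
    done
}

print_dcache_rules
```

Ports used only between dCache components (for example 33823 and 34075) are left out here on purpose; whether they need to be opened depends on how the domains are spread over hosts.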
administrator's script: prepare.sh
#!/bin/bash
# prepare to install dcache server; run this script as root
# install prerequisites:
yum -y install postgresql postgresql-libs postgresql-server
chkconfig postgresql on
# install umd
# clean old repo files and stale downloads
rm -f /etc/yum.repos.d/UMD* /etc/yum.repos.d/epel*
rm -f epel-release-5-4.noarch.rpm umd-release-1.0.2-1.el5.noarch.rpm
wget http://download.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm
wget http://repository.egi.eu/sw/production/umd/1/sl5/x86_64/updates/umd-release-1.0.2-1.el5.noarch.rpm
yum install epel-release-5-4.noarch.rpm
yum install yum-priorities
yum install umd-release-1.0.2-1.el5.noarch.rpm
sed -i -e "s/priority=.*/priority=5/g" /etc/yum.repos.d/UMD-1-base.repo
sed -i -e "s/priority=.*/priority=4/g" /etc/yum.repos.d/UMD-1-updates.repo
# Create the file /etc/grid-security/storage-authzdb manually
echo "Please edit the auth file /etc/grid-security/storage-authzdb. The file should contain all users (user name, access type, uid, gid...) that use the dCache storage. For example:"
echo '
version 2.1
authorize dgop0001 read-write 123001 64019 / / /
authorize dgopadm read-write 123999 64019 / / /
......'
# Also create the file /etc/grid-security/grid-vorolemap manually
echo '
"*" "/dgops" dgop0001
"*" "/dgops/admin/Role=dataadmin" dgopadm
......'
Install
Install dCache server and clients
- dcache-server
- dcap
- libdcap
- dcache-srmclient
administrator's script: install.sh
#!/bin/bash
# install dCache server and clients from umd
yum install dcache-server dcache-srmclient dcap libdcap-tunnel-gsi libdcap
Configure
Layout
dCache keeps the list of domains, and of the services that run within each domain, in the layout files. Every domain is a separate Java VM.
| dCache provides 3 typical layout files: head.conf, pools.conf, single.conf. |
| If a layout file defines more than one domain, set the parameter messageBroker=cells so that the domains can communicate with each other. If all services run in a single domain, set this parameter to none. |
dcache.conf
dcache.conf is the most important configuration file: it selects the layout file for the node and sets the node-specific service parameters. The available parameters are listed in the properties files in the /usr/share/dcache/defaults/ directory.
Chimera
Chimera is a namespace provider that gives a single-rooted view of the distributed dCache files. To configure Chimera:
- Create a database for Chimera
- Load create.sql and pgsql-procedures.sql into it
Pools
| dCache pools hold all the data ever written into dCache. These pools are completely independent from the PNFS directories (for the time being). In fact, they could be created anywhere and then mounted locally. Do not use the whole available disk space for the pools! dCache needs some additional space to keep records of the metadata linked to the files stored in each pool. |
All files in the dCache system are located in pools. Pools can be defined on several "pool nodes". To define a pool:
- create the pool directory
- create a new domain for the pool, or add the pool service to an existing domain, in the layout file
- configure PoolManager.conf
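The layout-file part of these steps can be sketched as a small generator. The stanza format (`[<domain>/pool]` followed by `name`, `path` and `size`) is the one used in the configure script on this page; the helper name `pool_stanza` is illustrative and not a dCache command.

```shell
# Hypothetical helper: prints a pool stanza for a layout file.
# Arguments: domain name, pool name, base directory, size (e.g. 1G).
pool_stanza() {
    printf '[%s/pool]\nname=%s\npath=%s/%s\nsize=%s\n' "$1" "$2" "$3" "$2" "$4"
}

# Example: the stanza for pool1 under /pools inside dCacheDomain.
pool_stanza dCacheDomain pool1 /pools 1G
```

In practice the output would be appended to the node's layout file after the pool directory has been created with `mkdir -p`; the pool still has to be added to a pool group in PoolManager.conf.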
PnfsManager
The NFS interface of Chimera provides access to the dCache file tree. Pnfs makes it possible to read and change information about a file without accessing the file data.
| Pnfs is only a service for manipulating the file tree. It provides access to the dCache file tree and can create files, delete them and change their attributes, but it cannot access file data directly. |
| Directory tags in PNFS are metadata that dCache evaluates and that future subdirectories inherit. |
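Directory tags are addressed through special file names of the form `.(tag)(<name>)` inside the directory, as the configure script on this page does with `AccessLatency`, `RetentionPolicy` and `sGroup`. A minimal sketch of that pattern follows; the helper names `tag_path` and `set_tag` are illustrative.

```shell
# Hypothetical helper: builds the special file name that addresses a
# directory tag, following the '.(tag)(<name>)' pattern used on this page.
tag_path() {
    printf '%s/.(tag)(%s)\n' "${1%/}" "$2"
}

# Hypothetical helper: writes a tag value. The directory must live in the
# mounted namespace for dCache to interpret the file as a tag.
set_tag() {
    echo "$3" > "$(tag_path "$1" "$2")"
}

# Show the path that would be written for the sGroup tag of a VO directory.
tag_path /pnfs/fzk.de/data/dgops sGroup
```

Building the path in one place makes sure the slash between the directory and the tag name is never forgotten. Reading a tag works the same way, for example `cat "$(tag_path /pnfs/fzk.de/data/dgops sGroup)"`.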
Publishing Dcache in site-bdii
See details [1]
administrator's script: configure.sh
#!/bin/bash
#Configure script for D-Cache from UMD repo
#variables
FQDN="dgiref-dcache.fzk.de"
hostname="dgiref-dcache"
domainname="fzk.de"
cluster_name="dgiref"
dcache_path="/etc/dcache/"
support_VOs="aerogrid astrogrid bauvogrid bioif bisgrid biz2grid bwgrid c3grid dgcms dgops dgtest education fingrid gdigrid hepcg ingrid interloggrid kerndgrid lifescience medigrid mosgrid partgrid progrid textgrid wisent"
pools_path="/pools"
pnfs_path="/pnfs"
#Change the access permissions and create the databases in postgres for dcache
echo "
local all all trust
host all all 127.0.0.1/32 trust
host all all ::1/128 trust" > /var/lib/pgsql/data/pg_hba.conf
/etc/init.d/postgresql restart
createdb -U postgres chimera
createuser -U postgres --no-superuser --no-createrole --createdb --pwprompt chimera
psql -U chimera chimera -f /usr/share/dcache/chimera/sql/create.sql
createlang -U postgres plpgsql chimera
psql -U chimera chimera -f /usr/share/dcache/chimera/sql/pgsql-procedures.sql
createuser -U postgres --no-superuser --no-createrole --createdb --pwprompt srmdcache
createdb -U srmdcache dcache
#Generate ssh-keys for dcache's admin interface
ssh-keygen -b 768 -t rsa1 -f /etc/dcache/server_key -N ""
ssh-keygen -b 1024 -t rsa1 -f /etc/dcache/host_key -N ""
#layout for dcache configuration
echo "[dCacheDomain]
[dCacheDomain/poolmanager]
[dCacheDomain/broadcast]
[dCacheDomain/loginbroker]
[dCacheDomain/topo]
[dCacheDomain/dcap]
[dCacheDomain/admin]
[dCacheDomain/spacemanager]
[dCacheDomain/pnfsmanager]
[dCacheDomain/cleaner]
[dCacheDomain/acl]
[dCacheDomain/nfsv3]
[dCacheDomain/dir]
[dCacheDomain/info]
[dCacheDomain/statistics]
[dCacheDomain/billing]
[dCacheDomain/srm-loginbroker]
[dCacheDomain/httpd]
[dCacheDomain/gplazma]
[dCacheDomain/gsi-pam]
[dCacheDomain/pinmanager]
[dCacheDomain/pool]
name=pool1
path=$pools_path/pool1
size=1G
[dCacheDomain/pool]
name=pool2
size=1G
path=$pools_path/pool2
[dCacheDomain/gsidcap]
[dCacheDomain/gridftp]
[dCacheDomain/srm]">$dcache_path/layouts/$hostname.conf
#dcache.conf configuration
echo "
dcache.user=root
dcache.layout=$hostname
poolmanager.setup.file=$dcache_path/PoolManager.conf
stagePolicyEnforcementPoint=PoolManager
cacheInfo=pnfs
srmSpaceManagerEnabled=yes
SpaceManagerReserveSpaceForNonSRMTransfers=true
SpaceManagerLinkGroupAuthorizationFileName=$dcache_path/LinkGroupAuthorization.conf
poolbasedir=$pools_path
RecursiveDirectoryCreation=true
pnfs=$pnfs_path
dcapIoQueue=dcapq
gsiftpIoQueue=gridftpq
poolIoQueue=gridftpq,dcapq
remoteGsiftpIoQueue=gridftpq
overwriteEnabled=true
srmGetReqSwitchToAsynchronousModeDelay=0
srmPutReqSwitchToAsynchronousModeDelay=0
srmLsRequestSwitchToAsynchronousModeDelay=infinity">$dcache_path/dcache.conf
#pools and PoolManager.conf configuration
mkdir -p $pools_path/pool1
mkdir -p $pools_path/pool2
echo 'psu create unit -net 0.0.0.0/255.255.255.255
psu create unit -net 0.0.0.0/0.0.0.0
psu create unit -store *@*
psu create unit -protocol */*'>$dcache_path/PoolManager.conf
for vo in $support_VOs; do
    echo "psu create unit -store $vo:$cluster_name@osm">>$dcache_path/PoolManager.conf
done
echo 'psu create ugroup any-store
psu addto ugroup any-store *@*
psu create ugroup world-net
psu addto ugroup world-net 0.0.0.0/0.0.0.0
psu create ugroup any-protocol
psu addto ugroup any-protocol */*'>>$dcache_path/PoolManager.conf
for vo in $support_VOs; do
    echo "psu addto ugroup any-store $vo:$cluster_name@osm">>$dcache_path/PoolManager.conf
done
echo 'psu create pool pool1
psu create pool pool2
psu create pgroup default
psu addto pgroup default pool1
psu addto pgroup default pool2
psu create link default-link any-store world-net any-protocol
psu set link default-link -readpref=10 -writepref=10 -cachepref=10 -p2ppref=-1
psu add link default-link default
psu create linkGroup default-linkGroup
psu set linkGroup custodialAllowed default-linkGroup false
psu set linkGroup replicaAllowed default-linkGroup true
psu set linkGroup nearlineAllowed default-linkGroup false
psu set linkGroup outputAllowed default-linkGroup false
psu set linkGroup onlineAllowed default-linkGroup true
psu addto linkGroup default-linkGroup default-link
rc onerror suspend
rc set max retries 999
rc set max retries 3
rc set retry 900
rc set warning path billing
rc set poolpingtimer 600
rc set slope 0.0
rc set p2p oncost
rc set stage oncost off
rc set stage off
set timeout pool 120
set costcuts -idle=0.0 -p2p=2.0 -alert=0.0 -halt=0.0 -fallback=0.0
rc set max copies 500
rc set max restore unlimited
rc set sameHostCopy besteffort
rc set max threads 0
cm set magic on'>>$dcache_path/PoolManager.conf
#LinkGroupAuthorization.conf configuration
echo "LinkGroup default-linkGroup" > $dcache_path/LinkGroupAuthorization.conf
for vo in $support_VOs; do
    echo "/$vo">>$dcache_path/LinkGroupAuthorization.conf
    echo "/$vo/admin/Role=dataadmin">>$dcache_path/LinkGroupAuthorization.conf
done
#dcachesrm-gplazma.policy configuration
echo '
xacml-vo-mapping="OFF"
saml-vo-mapping="OFF"
kpwd="OFF"
grid-mapfile="OFF"
gplazmalite-vorole-mapping="ON"
xacml-vo-mapping-priority="5"
saml-vo-mapping-priority="1"
kpwd-priority="3"
grid-mapfile-priority="4"
gplazmalite-vorole-mapping-priority="2"
kpwdPath="/opt/d-cache/etc/dcache.kpwd"
gridMapFilePath="/etc/grid-security/grid-mapfile"
storageAuthzPath="/etc/grid-security/storage-authzdb"
XACMLmappingServiceUrl="https://fledgling09.fnal.gov:8443/gums/services/GUMSXACMLAuthorizationServicePort"
xacml-vo-mapping-cache-lifetime="180"
mappingServiceUrl="https://fledgling09.fnal.gov:8443/gums/services/GUMSAuthorizationServicePort"
saml-vo-mapping-cache-lifetime="180"
gridVoRolemapPath="/etc/grid-security/grid-vorolemap"
gridVoRoleStorageAuthzPath="/etc/grid-security/storage-authzdb"
vomsValidation="false"
saz-client="OFF"
SAZ_SERVER_HOST="saz-server.oursite.edu"
SAZ_SERVER_PORT="8888"'>>$dcache_path/dcachesrm-gplazma.policy
#configure info-provider.xml for bdii
# note: -O (capital) writes the download to the given file; -o would only write wget's log there
wget http://mirror.scc.kit.edu/yum/downloads/confs/info-provider.xml -O $dcache_path/info-provider.xml
echo "To use your own configuration, adapt the file $dcache_path/info-provider.xml"
#pnfs configuration
mkdir -p $pnfs_path
echo "
/ localhost(rw)
/pnfs *.$domainname" >> /etc/exports
echo "localhost:/pnfs /pnfs" >> /etc/fstab
dcache start
mount localhost:/pnfs /pnfs
mkdir -p /pnfs/$domainname/data/
echo "ONLINE" > /pnfs/$domainname/data/'.(tag)(AccessLatency)'
echo "REPLICA" > /pnfs/$domainname/data/'.(tag)(RetentionPolicy)'
echo "$cluster_name" > /pnfs/$domainname/data/'.(tag)(sGroup)'
# create one directory per VO and set its tags (note the slash before the tag name)
for vo in $support_VOs; do
    mkdir /pnfs/$domainname/data/$vo
    echo "ONLINE" > /pnfs/$domainname/data/$vo/'.(tag)(AccessLatency)'
    echo "REPLICA" > /pnfs/$domainname/data/$vo/'.(tag)(RetentionPolicy)'
    echo "$cluster_name" > /pnfs/$domainname/data/$vo/'.(tag)(sGroup)'
    echo "StoreName $vo" > /pnfs/$domainname/data/$vo/'.(tag)(OSMTemplate)'
done
# give each VO directory to the uid:gid mapped for it in storage-authzdb
for dir in `ls /pnfs/$domainname/data`
do
    uname=$(grep -m1 "$dir" /etc/grid-security/grid-vorolemap | cut -d' ' -f 3)
    uid=$(grep -m1 "$uname" /etc/grid-security/storage-authzdb | cut -d' ' -f 4-5 | sed 's/ /:/')
    chown "$uid" "/pnfs/$domainname/data/$dir"
done
dcache stop
Proceed
- Register dCache as an operating system service and enable it at boot, so that it is started automatically during the boot phase.
- Start dCache
administrator's script: proceed.sh
#!/bin/sh
# start/stop dcache server; run this script as root
# make dCache a known service so it will be started during boot phase.
chkconfig --add dcache-server
chkconfig dcache-server on
service dcache start
Initial test
Simple dCache test for client side:
- Grid user init
- Generate random data file
- Make a directory via srm service
- Upload random file to dCache via srm
- Download file from dCache server
- Delete the new file and the new directory on the dCache server
- Compare the uploaded and downloaded files
If all steps succeed, the dCache service is most likely working normally.
administrator's script: test.sh
#!/bin/bash
#test script for dcache
FQDN="dgiref-dcache.fzk.de"
hostname="dgiref-dcache"
domainname="fzk.de"
cluster_name="dgiref"
voms-proxy-init -voms dgops
dd if=/dev/urandom of=/tmp/testcopy count=1024
# create the same directory that the copy commands below write to
srmmkdir srm://$FQDN/pnfs/$domainname/data/dgops/test/
srmcp file:////tmp/testcopy srm://$FQDN/pnfs/$domainname/data/dgops/test/testcopy
srmcp srm://$FQDN/pnfs/$domainname/data/dgops/test/testcopy file:////tmp/testcopy2
srmrm srm://$FQDN/pnfs/$domainname/data/dgops/test/testcopy
srmrmdir srm://$FQDN/pnfs/$domainname/data/dgops/test/
# compare the uploaded and the downloaded file
if diff /tmp/testcopy /tmp/testcopy2; then
    echo 'Test passed'
else
    echo 'Test not passed'
fi
Update
| To keep updates easy, do not change the settings in /usr/share/dcache/defaults/*.properties manually; put all changes into dcache.conf. |
- To install a new version from UMD, just use yum update.
administrator's script: update.sh
#!/bin/bash
# update dcache server
dcache stop
yum update dcache-server
service dcache start
dCache extensions
Authorization in dCache
Assume that dCache is installed and configured with two pools for disk-only files (as described on data:Dcache/1912/server). Now we need to tell dCache who is actually allowed to write to and read from the pools. The easiest way to enable attribute-based authorization is the vorole mapping. For this, another set of files has to be edited.
The official documentation for dCache (“dCache The Book”) has an up-to-date chapter about this method: http://www.dcache.org/manuals/Book/config/cf-gplazma-vorole.shtml.
/etc/dcache/dcachesrm-gplazma.policy
In this file dCache looks up which authorization methods are enabled and in which order they apply (there may be up to 5 different methods). For our purpose we only need to activate gplazmalite-vorole-mapping:
kpwd="OFF"
gplazmalite-vorole-mapping="ON"
Accordingly, two other variables need to be checked (remember their values):
# Built-in gPLAZMAlite grid VO role mapping
gridVoRolemapPath="/etc/grid-security/grid-vorolemap"
gridVoRoleStorageAuthzPath="/etc/grid-security/storage-authzdb"
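Whether a method is switched on can be checked with a simple pattern match on the policy file. This is a minimal sketch, assuming the `name="ON"` / `name="OFF"` line format shown above; the helper name `method_enabled` is illustrative, and the demonstration uses a temporary file rather than the real /etc/dcache/dcachesrm-gplazma.policy.

```shell
# Hypothetical helper: succeeds if the given method is switched ON in a
# gPlazma policy file (lines of the form name="ON" or name="OFF").
# Arguments: policy file, method name.
method_enabled() {
    grep -q "^[[:space:]]*$2=\"ON\"" "$1"
}

# Demonstration on a temporary policy file with the two lines from the text.
policy=$(mktemp)
printf '%s\n' 'kpwd="OFF"' 'gplazmalite-vorole-mapping="ON"' > "$policy"
method_enabled "$policy" gplazmalite-vorole-mapping && echo "vorole mapping is enabled"
method_enabled "$policy" kpwd || echo "kpwd is disabled"
rm -f "$policy"
```

Because the pattern requires `="ON"` directly after the name, the related `gplazmalite-vorole-mapping-priority` line is not mistaken for the switch itself.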
/etc/grid-security/grid-vorolemap
The grid-vorolemap file performs the first part of attribute-based authorization: it maps user roles onto user names. This file may be generated automatically, but it is also easy to maintain by hand. However, no template is shipped with the dCache installation, so one has to create it at /etc/grid-security/grid-vorolemap.
"*" "<group>" <username>
For example:
"*" "/ops" ops001
This line introduces the mapping of every possible DN (the asterisk is used as wildcard, but a specific DN is also valid) together with the attribute /ops onto the user “ops001”. This attribute can also reflect the role a user came with:
"*" "/<group>/<subgroup>" <username>
"*" "/<group>/Role=<role>" <username>
At this point it does not matter which user names are used for the mappings, as they do not have to be existing Unix user accounts. To compensate for this, the storage-authzdb file is needed.
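To see which user name a given attribute resolves to, the map file can be queried with awk. The sketch below only handles wildcard-DN entries of the form shown above and does an exact match on the attribute; the helper name `lookup_user` and the second sample entry (`opsadm`) are made up for illustration.

```shell
# Hypothetical helper: prints the user name mapped to an FQAN for the
# wildcard DN "*". Splitting on the double quote keeps DNs and FQANs that
# contain spaces intact.
# Arguments: grid-vorolemap file, FQAN.
lookup_user() {
    awk -F'"' -v fqan="$2" '
        /^[ \t]*#/ { next }              # skip comment lines
        $2 == "*" && $4 == fqan {
            gsub(/[ \t]/, "", $5)        # strip blanks around the user name
            print $5
            exit
        }' "$1"
}

# Demonstration on a temporary map file.
map=$(mktemp)
cat > "$map" <<'EOF'
"*" "/ops" ops001
"*" "/ops/Role=admin" opsadm
EOF
lookup_user "$map" "/ops/Role=admin"
rm -f "$map"
```

The first matching line wins, which mirrors reading the file from top to bottom; whether dCache itself resolves multiple matches this way is not stated on this page.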
/etc/grid-security/storage-authzdb
As mentioned before, this file gives the “virtual users” used in the grid-vorolemap a proper uid and gid.
authorize <username> read-write <uid> <gid> / / /
authorize ops001 read-write 22001 5850 / / /
Besides this, the very first (non-comment) line must specify the version of the storage-authzdb format: “version 2.1”.
The three slashes at the end of the line exist mostly for legacy reasons. Have a look at dCache - The Book (http://www.dcache.org/manuals/Book/config/cf-gplazma-authzdb.shtml) for further details.
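Both rules for this file (a leading version line, and one `authorize` line per user) are easy to check from the shell. The following is a sketch under the assumption that fields are separated by whitespace, as in the example; the helper names `uid_gid_for` and `has_version_line` are illustrative.

```shell
# Hypothetical helper: prints "uid:gid" for a user from a storage-authzdb file.
# Arguments: storage-authzdb file, user name.
uid_gid_for() {
    awk -v user="$2" '$1 == "authorize" && $2 == user { print $4 ":" $5; exit }' "$1"
}

# Hypothetical helper: succeeds if the first line that is neither a comment
# nor empty is the version line.
has_version_line() {
    [ "$(grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" | head -n 1)" = "version 2.1" ]
}

# Demonstration on a temporary file with the example entry from the text.
db=$(mktemp)
cat > "$db" <<'EOF'
# sample storage-authzdb
version 2.1
authorize ops001 read-write 22001 5850 / / /
EOF
has_version_line "$db" && uid_gid_for "$db" ops001
rm -f "$db"
```

The `uid:gid` form is the one `chown` accepts, which is how the mapping is applied to the VO directories in the configure script.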
/etc/dcache/LinkGroupAuthorization.conf
Lastly, dCache needs to know which users and roles are allowed to work with the defined link groups. This is configured inside /etc/dcache/LinkGroupAuthorization.conf. We can either allow any authenticated user and role, or restrict access to known VOs.
LinkGroup default-linkGroup
*/Role=*
or
LinkGroup default-linkGroup
<group>
<group>/<subgroup>
<group>/Role=<role>
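For a longer list of VOs the restricted variant can be generated rather than typed. This is a minimal sketch that follows the pattern of the configure script on this page (one `/<vo>` line and one `/<vo>/admin/Role=dataadmin` line per VO); the helper name `link_group_auth` is illustrative, and the `admin` subgroup and `dataadmin` role are that script's choices, not requirements of dCache.

```shell
# Hypothetical helper: prints a LinkGroupAuthorization.conf body that admits
# each listed VO and its dataadmin role.
# Arguments: link group name, then one or more VO names.
link_group_auth() {
    local group="$1" vo
    shift
    echo "LinkGroup $group"
    for vo in "$@"; do
        echo "/$vo"
        echo "/$vo/admin/Role=dataadmin"
    done
}

link_group_auth default-linkGroup dgops dgtest
```

Redirecting the output to /etc/dcache/LinkGroupAuthorization.conf replaces the file in one step, which avoids leftovers from earlier runs.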