Title: DPM Installation
1 DPM Installation
Rosanna Catania, Consorzio COMETA
Joint EELA-2/EGEE-III tutorial for trainers
Catania, 30 June - 4 July 2008
2 Outline
- Overview
- Installation
- Administration and troubleshooting
- References
4 DPM Overview
- "A file is considered to be a Grid file if it is both physically present in a SE and registered in the file catalogue." (gLite 3.1 User Guide, p. 103)
- The Storage Element (SE) is the service which allows a user or an application to store data for future retrieval. All data in an SE must be considered read-only and therefore cannot be changed unless physically removed and replaced. Different VOs might enforce different policies for space quota management.
- The Disk Pool Manager (DPM) is a lightweight solution for disk storage management, which offers the SRM (Storage Resource Manager) interfaces (SRM 2.2 was released in DPM version 1.6.3).
5 DPM Overview
- Each DPM-type Storage Element (SE) is composed of a head node and a disk server on the same machine.
- The DPM handles the storage on disk servers. It handles pools: a pool is a group of file systems located on one or more disk servers. A DPM disk server can have multiple file systems in the pool.
- The DPM head node has to have at least one file system in the pool; an arbitrary number of disk servers can then be added by YAIM.
7 DPM Overview
- Usually the DPM head node hosts:
  - SRM server (srmv1 and/or srmv2): receives the SRM requests and passes them to the DPM server
  - DPM server: keeps track of all the requests
  - DPM name server (DPNS): handles the namespace for all the files under DPM control
  - DPM RFIO server: handles the transfers for the RFIO protocol
  - DPM GridFTP server: handles the transfers for the GridFTP protocol
8 DPM Overview
- The Storage Resource Manager (SRM) has been designed to be the single interface (through the corresponding SRM protocol) for the management of disk and tape storage resources. Any type of Storage Element in WLCG/EGEE offers an SRM interface except for the Classic SE, which is being phased out. SRM hides the complexity of the resource setup behind it and allows the user to request files, keep them on a disk buffer for a specified lifetime (SRM 2.2 only), reserve space for new entries, and so on. SRM also offers a third-party transfer protocol between different endpoints, which is however not supported by all SE implementations. Note that the SRM protocol is a storage management protocol and not a file access protocol.
9 DPM strengths
- Easy to install/configure
- Few configuration files
- Manageable storage
- Logical Namespace
- Easy to add/remove file systems
- Low maintenance effort
- Supports as many disk servers as needed
- Low memory footprint
- Low CPU utilization
10 What kind of machines?
- 2 GHz processor with 512 MB of memory (not a hard requirement)
- Dual power supply
- Mirrored system disk
- Database backups
11 Before installing
- For each VO, what is the expected load?
- Does the DPM need to be installed on a separate machine?
- How many disk servers do I need?
  - Disk servers can easily be added or removed later
- Which file system type?
- At my site, can I open ports:
  - 5010 (Name Server)?
  - 5015 (DPM server)?
  - 8443 (srmv1)?
  - 8444 (srmv2)?
  - 8446 (srmv2.2)?
  - 5001 (rfio)?
  - 20000-25000 (rfio data port)?
  - 2811 (DPM GridFTP control port)?
  - 20000-25000 (DPM GridFTP data port)?
12 Firewall Configuration
- The following ports have to be open:
  - DPM server: port 5015/tcp must be open at least locally at your site (it can be open to incoming access as well)
  - DPNS server: port 5010/tcp must be open at least locally at your site (it can be open to incoming access as well)
  - SRM servers: ports 8443/tcp (SRMv1) and 8444/tcp (SRMv2) must be open to the outside world (incoming access)
  - RFIO server: port 5001/tcp must be open to the outside world (incoming access) if your site wants to allow direct RFIO access from outside
  - GridFTP server: control port 2811/tcp and data ports 20000-25000/tcp (or any range specified by GLOBUS_TCP_PORT_RANGE) must be open to the outside world (incoming access)
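As a rough illustration of the port list above, the following sketch prints iptables rules for those ports, plus 8446 (srmv2.2) from the earlier checklist. It only prints the commands and changes nothing. The function name dpm_firewall_rules, the use of iptables with the INPUT chain, and accepting traffic from any source are assumptions made for this example and are not part of the original tutorial; a real site would probably restrict ports 5010 and 5015 to local addresses.

```shell
#!/bin/sh
# Print (not apply) iptables rules for the DPM ports named on the slides.
# Assumption: plain "iptables -A INPUT ... -j ACCEPT" rules suit the site.
dpm_firewall_rules() {
    # Optional first argument: GridFTP/RFIO data port range as LOW:HIGH.
    data_range="${1:-20000:25000}"
    for port in 5010 5015 8443 8444 8446 5001 2811 "$data_range"; do
        echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
    done
}

dpm_firewall_rules
```

If GLOBUS_TCP_PORT_RANGE is set to a different range at your site, pass it as the argument in LOW:HIGH form, which is the notation iptables expects for port ranges.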
14 What kind of machines?
- Install SL4 using the SL4.X repository (CERN mirror), choosing the following rpm groups:
  - X Window System
  - Editors
  - X Software Development
  - Text-based Internet
  - Server Configuration Tools
  - Development Tools
  - Administration Tools
  - System Tools
  - Legacy Software Development
- For 64-bit machines, you also have to select the following groups (not tested):
  - Compatibility Arch Support
  - Compatibility Arch Development Support
15 Installation Pre-requisites
- Start from a machine with Scientific Linux CERN 4.X i386 installed.
- Prepare the file systems (e.g. a directory /data, not /dpm!). All the file systems must have the following ownership and permissions:
  - ls -ld /data01
  - drwxrwx--- 3 dpmmgr dpmmgr 4096 Jun 9 12:14 data01
- Time synchronization among all gLite nodes is mandatory. It can be achieved with the NTP protocol and a time server.
- Install ntp if it is not already available on your system:
  - yum install ntp
- Add your time server to /etc/ntp.conf:
  - restrict <time_server_IP_address> mask 255.255.255.255 nomodify notrap noquery
  - server <time_server_IP>
  - (you can use the NTP server ntp-1.infn.it)
- Edit /etc/ntp/step-tickers, adding the hostname(s) of your time server(s).
- Activate the ntpd service with the following commands:
  - ntpdate <your ntp server name>
  - service ntpd start
  - chkconfig ntpd on
16 Repository settings
- Choose the repository list that matches the node type:
  - ig_SE_dpm_disk: REPOS="ca dag glite-se_dpm_disk ig jpackage gilda"
  - ig_SE_dpm_mysql: REPOS="ca dag glite-se_dpm ig jpackage gilda"
  - both on the same machine: REPOS="ca dag glite-se_dpm glite-se_dpm_disk ig jpackage gilda"
- Download the repository files and update:
  - for name in $REPOS; do wget http://grid018.ct.infn.it/mrepo/repos/$name.repo -O /etc/yum.repos.d/$name.repo; done
  - yum clean all
  - yum update
17 Installation Pre-requisites
- Install JDK 1.5.0 before installing the metapackage:
  - yum install jdk java-1.5.0-sun-compat
  - rpm -ihv http://grid-it.cnaf.infn.it/mrepo/ig_sl4-i386/RPMS.3_1_0_externals/jdk-1.5.0_14-fcs.i586.rpm
  - rpm -ihv http://grid-it.cnaf.infn.it/mrepo/ig_sl4-i386/RPMS.3_1_0_externals/java-1.5.0-sun-compat-1.5.0.14-1.sl4.jpp.noarch.rpm
18 Installation
- We are ready to install a DPM server and a disk server on the same machine. This command downloads and installs all the needed packages:
  - yum install ig_SE_dpm_mysql ig_SE_dpm_disk
- Install all Certificate Authorities:
  - yum install lcg-CA
- If you plan to use certificates released by CAs not supported by EGEE, make sure that their public key, signing policy and CRLs (usually distributed as an rpm) are installed in /etc/grid-security/certificates.
- Install ca_GILDA and gilda-vomscerts:
  - yum install gilda_utils
19 Installation
- If the metapackage installation reports missing dependencies, this is probably due to the protection normally set on the OS repositories. In these cases the metapackage requires a higher version of a package than the one present in the OS repository; the higher version is usually provided by the DAG repository. For example:
    perl-XML-NamespaceSupport 100% 2.1 kB 00:00
    ---> Package perl-XML-NamespaceSupport.noarch 0:1.08-6 set to be updated
    --> Running transaction check
    --> Processing Dependency: perl-SOAP-Lite >= 0.67 for package: gridview-wsclient-common
    --> Finished Dependency Resolution
    Error: Missing Dependency: perl-SOAP-Lite >= 0.67 is needed by package gridview-wsclient-common
- Download the required version and install it locally:
  - wget http://linuxsoft.cern.ch/dag/redhat/el4/en/i386/RPMS.dag/perl-SOAP-Lite-0.69-1.el4.rf.noarch.rpm
  - yum localinstall perl-SOAP-Lite-0.69-1.el4.rf.noarch.rpm
20 Security
- Check the fully qualified host name:
  - hostname -f
- Install the host certificate
  - Download your certificates to /etc/grid-security:
    - mv hostxx-cert.pem /etc/grid-security/hostcert.pem
    - mv hostxx-key.pem /etc/grid-security/hostkey.pem
  - and set proper permissions:
    - chmod 644 /etc/grid-security/hostcert.pem
    - chmod 400 /etc/grid-security/hostkey.pem
- http://security.fi.infn.it/CA/docs
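A small helper can confirm that the certificate and key ended up with the modes set above. This is only a sketch: check_mode is a hypothetical name introduced here, and it assumes GNU stat (as found on Scientific Linux), whose -c '%a' option prints the octal mode.

```shell
#!/bin/sh
# Compare a file's octal mode with the expected one (GNU stat assumed).
check_mode() {
    # $1 = file, $2 = expected octal mode, e.g. 400
    actual=$(stat -c '%a' "$1") || return 2
    if [ "$actual" = "$2" ]; then
        echo "OK   $1 ($actual)"
    else
        echo "FAIL $1 (mode $actual, expected $2)"
        return 1
    fi
}

# Typical use on the DPM node:
#   check_mode /etc/grid-security/hostcert.pem 644
#   check_mode /etc/grid-security/hostkey.pem 400
```

The function returns a non-zero status on a mismatch, so it can be used in a pre-configuration script that stops before YAIM is run.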
21 Site Configuration File (1/4)
- All site-specific configuration values have to be set in a site configuration file using key-value pairs.
- This file is shared among all the different gLite node types, so edit it once and keep it in a safe place.
- Copy the /opt/glite/yaim/examples/site-info.def template (coming from the lcg-yaim RPM) to your reference directory for the installation (e.g. /root):
  - cp /opt/glite/yaim/examples/siteinfo/ig-site-info.def /opt/glite/yaim/etc/gilda/gilda-site-info.def
- The general syntax of the file is a sequence of bash-like variable assignments (<variable>=<value>; no spaces are allowed around =).
- A good syntax test for your site configuration file is to source it manually by running the command:
  - source my-site-info.def
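Because spaces around = are the most common mistake in such a file, a quick scan before sourcing it can help. The sketch below is one possible check, not part of YAIM: find_spaced_assignments is a hypothetical helper that lists assignment-looking lines with whitespace next to the = sign.

```shell
#!/bin/sh
# List assignment lines that have whitespace around "=", which bash would
# not treat as a variable assignment. Prints "line_number:line" per hit.
find_spaced_assignments() {
    # $1 = path to the site configuration file
    grep -nE '^[[:space:]]*[A-Za-z_][A-Za-z0-9_]*([[:space:]]+=|=[[:space:]])' "$1"
}

# Typical use (the file name is the placeholder used on the slide):
#   find_spaced_assignments my-site-info.def || echo "no spacing problems found"
#   bash -n my-site-info.def    # syntax check only, nothing is executed
```

Comment lines are ignored because the pattern requires the line to start with a variable name. The bash -n call complements the scan by catching unbalanced quotes without executing the file.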
22 Site Configuration File (2/4)
- Set the following variables:
  - MY_DOMAIN=trigrid.it
  - JAVA_LOCATION=/usr/java/jdk1.5.0_14
  - DPM_HOST=hostxx.$MY_DOMAIN
  - DPMPOOL=Permanent (or Volatile)
- The DPM can handle two different kinds of file systems:
  - volatile: the files contained in a volatile file system can be removed by the system at any time, unless they are pinned by a user.
  - permanent: the files contained in a permanent file system cannot be removed by the system.
23 Site Configuration File (3/4)
- Set the following variables:
  - DPM_FILESYSTEMS="$DPM_HOST:/data"
  - DPM_DB_USER=dpmmgr
  - DPM_DB_PASSWORD=dpmmgr_password
  - DPM_DB_HOST=$DPM_HOST
  - DPMFSIZE=200
  - MYSQL_PASSWORD=your_DB_root_passwd
  - VOS="gilda"
  - SE_LIST="$DPM_HOST"
  - SE_ARCH=multidisk
  - ALL_VOMS_VOS="gilda"
  - RFIO_PORT_RANGE="20000 25000"
24 Site Configuration File (4/4)
- Check the users and groups configuration:
  - Copy the users and groups example files to /opt/glite/yaim/etc/gilda/:
    - cp /opt/glite/yaim/examples/ig-groups.conf /opt/glite/yaim/etc/gilda/
    - cp /opt/glite/yaim/examples/ig-users.conf /opt/glite/yaim/etc/gilda/
  - Append the gilda and geclipsetutor users and groups definitions to ig-users.conf and ig-groups.conf in /opt/glite/yaim/etc/gilda/:
    - cat /opt/glite/yaim/etc/gilda/gilda_ig-users.conf >> /opt/glite/yaim/etc/gilda/ig-users.conf
    - cat /opt/glite/yaim/etc/gilda/gilda_ig-groups.conf >> /opt/glite/yaim/etc/gilda/ig-groups.conf
  - Define the new paths of your USERS_CONF and GROUPS_CONF files in /opt/glite/yaim/etc/gilda/<your_site-info.def>:
    - GROUPS_CONF=/opt/glite/yaim/etc/gilda/ig-groups.conf
    - USERS_CONF=/opt/glite/yaim/etc/gilda/ig-users.conf
25 gLite Middleware Configuration
- Now we can configure the node:
  - /opt/glite/yaim/bin/ig_yaim -c -s site-info.def -n ig_SE_dpm_mysql -n ig_SE_dpm_disk
- After configuration, remember to manually run the script /etc/cron.monthly/create-default-dirs-DPM.sh, as suggested by the yaim log. This script creates the VO storage directories and sets the correct permissions on them; it will also be run monthly via cron.
27 Adding a Disk Server (1/2)
- On the disk server, repeat slides 14-23, then edit site-info.def to add your new file system:
  - DPM_FILESYSTEMS="disk_server02.ct.infn.it:/storage02"
- Still on the disk server, install and configure the disk server packages:
  - yum install ig_SE_dpm_disk
  - /opt/glite/yaim/bin/ig_yaim -c -s site-info.def -n ig_SE_dpm_disk
- On the head node, add the file system to the pool:
  - dpm-addfs --poolname Permanent --server Disk_Server_Hostname --fs /storage02
28 Adding a Disk Server (2/2)
- Verify the result with dpm-qryconf:
    [root@wm-user-25 root]# dpm-qryconf
    POOL testpool DEFSIZE 200.00M GC_START_THRESH 0 GC_STOP_THRESH 0 DEF_LIFETIME 7.0d DEFPINTIME 2.0h MAX_LIFETIME 1.0m MAXPINTIME 12.0h FSS_POLICY maxfreespace GC_POLICY lru RS_POLICY fifo GIDS 0 S_TYPE - MIG_POLICY none RET_POLICY R
        CAPACITY 9.82G FREE 2.59G ( 26.4%)
      wm-user-25.gs.ba.infn.it /data CAPACITY 4.91G FREE 1.23G ( 25.0%)
      wm-user-24.gs.ba.infn.it /data01 CAPACITY 4.91G FREE 1.36G ( 27.7%)
    [root@wm-user-25 root]#
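The dpm-qryconf listing lends itself to simple monitoring. The sketch below filters such output for file systems whose free space falls below a threshold. It assumes the line layout shown on this slide (server, file system, CAPACITY, size, FREE, size, percentage in parentheses); low_free_filesystems is a hypothetical helper name, and the layout may differ between DPM versions.

```shell
#!/bin/sh
# Read dpm-qryconf-style text on stdin and print "server file_system percent"
# for every file system whose free space is below the given percentage.
low_free_filesystems() {
    # $1 = threshold in percent, e.g. 26
    awk -v limit="$1" '
        $3 == "CAPACITY" && $5 == "FREE" {
            pct = $0
            sub(/^.*\(/, "", pct)      # keep what follows the last "("
            sub(/%?\).*$/, "", pct)    # drop the closing "%)" and anything after
            gsub(/ /, "", pct)         # drop padding blanks
            if (pct + 0 < limit + 0) print $1, $2, pct
        }'
}

# Typical use on the head node:
#   dpm-qryconf | low_free_filesystems 26
```

The pool header and the pool summary line are skipped because their third field is not CAPACITY, so only per-file-system lines are reported.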
29 Load balancing
- DPM automatically round-robins between file systems
- Example:
  - disk01: 1 TB file system
  - disk02: very fast, 5 TB file system
- Solution 1: one file system per disk server
  - A file will be stored on either disk with equal probability, as long as space is left
- Solution 2: one file system on disk01, two file systems on disk02
  - A file will more often end up on disk02, which is what you want
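The effect of the two solutions can be estimated with simple arithmetic. The sketch below is a simplified model, not DPM code: it assumes new files are spread evenly over all file systems in the pool and ignores free space and any file system selection policy. rr_share is a hypothetical helper that takes server:number_of_file_systems pairs.

```shell
#!/bin/sh
# Expected share of new files per disk server when files are spread evenly
# over file systems. Arguments are server:number_of_file_systems pairs.
rr_share() {
    total=0
    for pair in "$@"; do
        total=$(( total + ${pair##*:} ))
    done
    for pair in "$@"; do
        # Integer arithmetic, so percentages are rounded down.
        echo "${pair%%:*} $(( 100 * ${pair##*:} / total ))%"
    done
}

rr_share disk01:1 disk02:1    # Solution 1
rr_share disk01:1 disk02:2    # Solution 2
```

Under this model, Solution 1 splits files evenly between the two servers, while Solution 2 sends about two thirds of new files to disk02, the larger and faster server.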
30 Restrict a pool to one or several VOs/groups
- By default, a pool is generic: users from all VOs/groups will be able to write in it.
- But it is possible to restrict a pool to one or several VOs/groups. See the dpm-addpool and dpm-modifypool man pages.
- For instance:
  - Dedicate a pool to one or several groups:
    - dpm-addpool --poolname poolA --group alice,cms,lhcb
    - dpm-addpool --poolname poolB --group atlas
  - Add groups to the existing list:
    - dpm-modifypool --poolname poolB --group +dteam
  - Remove groups from the existing list:
    - dpm-modifypool --poolname poolA --group -cms
  - Reset the list to a new set of groups (the = sign is optional for backward compatibility):
    - dpm-modifypool --poolname poolA --group =dteam
  - Add a group and remove another one:
    - dpm-modifypool --poolname poolA --group +dteam,-lhcb
31 Obtained Configuration (1)
- RFIO and GridFTP parent processes run as root
- Dedicated user/group
  - DPM, DPNS and SRM daemons run as dpmmgr
  - Several directories/files belong to dpmmgr
- Host certificate and key:
    > ll /etc/grid-security/ | grep pem
    -rw-r--r-- 1 root root 5430 May 28 22:02 hostcert.pem
    -r-------- 1 root root 1675 May 28 22:02 hostkey.pem
    > ll /etc/grid-security/dpmmgr/ | grep pem
    -rw-r--r-- 1 dpmmgr dpmmgr 5430 May 28 22:02 dpmcert.pem
    -r-------- 1 dpmmgr dpmmgr 1675 May 28 22:02 dpmkey.pem
32 Obtained Configuration (2)
- Database connection details:
  - /opt/lcg/etc/NSCONFIG
  - /opt/lcg/etc/DPMCONFIG
  - Format: <username>/<password>@<mysql_server>
- Daemons:
  - service <service_name> start|stop|status
  - Important: services are not restarted by an RPM upgrade!
33 Obtained Configuration (3): Virtual Ids
- Each user and each group is internally mapped to a "virtual Id".
- The mappings are stored in the Cns_userinfo table for the users and in the Cns_groupinfo table for the groups:
    mysql> use cns_db;
    mysql> select * from Cns_groupinfo;
    +-------+-----+-----------+
    | rowid | gid | groupname |
    +-------+-----+-----------+
    |     1 | 101 | dteam     |
    |     2 | 102 | atlas     |
    |     3 | 103 | cms       |
    |     4 | 104 | babar     |
    |     5 | 105 | infngrid  |
    +-------+-----+-----------+
    mysql> select * from Cns_userinfo;
    +-------+--------+----------------------------------------------+
    | rowid | userid | username                                     |
    +-------+--------+----------------------------------------------+
    |     1 |    101 | /C=CH/O=CERN/OU=GRID/CN=Sophie Lemaitre 2268 |
34 Testing a DPM (1/7)
    [root@infn-se-01 root]# dpm-qryconf
    POOL Permanent DEFSIZE 200.00M GC_START_THRESH 0 GC_STOP_THRESH 0 DEF_LIFETIME 7.0d DEFPINTIME 2.0h MAX_LIFETIME 1.0m MAXPINTIME 12.0h FSS_POLICY maxfreespace GC_POLICY lru RS_POLICY fifo GID 0 S_TYPE - MIG_POLICY none RET_POLICY R
        CAPACITY 21.81T FREE 21.81T (100.0%)
      infn-se-01.ct.pi2s2.it /gpfs CAPACITY 21.81T FREE 21.81T (100.0%)
    [root@infn-se-01 root]#
35 Testing a DPM (2/7)
    [root@infn-se-01 root]# dpns-ls -l /
    drwxrwxr-x   1 root     root            0 Jun 12 20:17 dpm
    [root@infn-se-01 root]# dpns-ls -l /dpm
    drwxrwxr-x   1 root     root            0 Jun 12 20:17 ct.pi2s2.it
    [root@infn-se-01 root]# dpns-ls -l /dpm/ct.pi2s2.it
    drwxrwxr-x   4 root     root            0 Jun 12 20:17 home
    [root@infn-se-01 root]# dpns-ls -l /dpm/ct.pi2s2.it/home
    drwxrwxr-x   0 root     104             0 Jun 12 20:17 alice
    drwxrwxr-x   1 root     102             0 Jun 13 23:11 cometa
    drwxrwxr-x   0 root     105             0 Jun 12 20:17 infngrid
    [root@infn-se-01 root]#
36 Testing a DPM (3/7)
- Try the previous two tests from a UI, after you have initialized a valid proxy and exported the following variables:
    [rosanna@infn-ui-01 root]$ export DPM_HOST=your_dpm
    [rosanna@infn-ui-01 root]$ export DPNS_HOST=your_dpns
37 Testing a DPM (4/7)
    [rosanna@infn-ui-01 rosanna]$ globus-url-copy file://$PWD/hostname.jdl gsiftp://infn-se-01.ct.pi2s2.it/tmp/myfile
    [rosanna@infn-ui-01 rosanna]$ globus-url-copy gsiftp://infn-se-01.ct.pi2s2.it/tmp/myfile file://$PWD/hostname.jdl
    [rosanna@infn-ui-01 rosanna]$ edg-gridftp-ls gsiftp://infn-se-01.ct.pi2s2.it/dpm
    [rosanna@infn-ui-01 rosanna]$ dpns-ls -l /dpm/ct.pi2s2.it/home/cometa
38 Testing a DPM (5/7)
- lcg_utils (from a UI)
  - If the DPM is not in the site BDII yet:
    - export LCG_GFAL_INFOSYS=hostxx.trigrid.it:2170
    - lcg-cr -v --vo infngrid -d hostxx.trigrid.it file:/dir/file
  - Otherwise:
    - export LCG_GFAL_INFOSYS=hostxx.trigrid.it:2170
    - lcg-infosites --vo gilda se | grep <your_SE>
    - lcg-cr -v --vo dteam -d dpm01.cern.ch file:/path/to/file
    - lcg-cp --vo gilda guid:<your_guid> file:/dir/file
- rfio (from a UI)
  - export LCG_RFIO_TYPE=dpm
  - export DPNS_HOST=dpm01.cern.ch
  - export DPM_HOST=dpm01.cern.ch
  - rfdir /dpm/cern.ch/home/myVO
  - rfcp /dpm/cern.ch/home/myVO/myfile /tmp/myfile
39 Testing a DPM (6/7)
    [rosanna@infn-ui-01 rosanna]$ lfc-mkdir /grid/cometa/test
    [rosanna@infn-ui-01 rosanna]$ lfc-ls /grid/cometa/test
    test ...
    [rosanna@infn-ui-01 rosanna]$ lcg-cr --vo cometa file:/home/rosanna/hostname.jdl -l lfn:/grid/cometa/test05.txt -d infn-se-01.ct.pi2s2.it
    guid:99289f77-6d3b-4ef2-8e18-537e9dc7cccf
    [rosanna@infn-ui-01 rosanna]$ lcg-cp --vo cometa lfn:/grid/cometa/test05.txt file:$PWD/test05.rep.txt
    [rosanna@infn-ui-01 rosanna]$
40 Testing a DPM (7/7)
    [rosanna@infn-ui-01 rosanna]$ lcg-infosites --vo cometa se
    Avail Space(Kb)  Used Space(Kb)  Type  SEs
    ----------------------------------------------------------
    7720000000       n.a             n.a   inaf-se-01.ct.pi2s2.it
    21810000000      n.a             n.a   infn-se-01.ct.pi2s2.it
    4090000000       n.a             n.a   unime-se-01.me.pi2s2.it
    21810000000      n.a             n.a   infn-se-01.ct.pi2s2.it
    21810000000      n.a             n.a   infn-se-01.ct.pi2s2.it
    14540000000      n.a             n.a   unipa-se-01.pa.pi2s2.it
    [rosanna@infn-ui-01 rosanna]$ ldapsearch -x -H ldap://infn-ce-01.ct.pi2s2.it:2170 -b mds-vo-name=resource,o=grid | grep AvailableSpace
    GlueSAStateAvailableSpace: 21810000000
    (or grep GlueSAStateUsedSpace for the used space)
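A quick cross-check is to compare the space published in the information system with what dpm-qryconf reports on the head node. The sketch below converts kilobytes to terabytes using decimal units; that choice is an assumption, made because it matches the figures on these slides (21810000000 Kb published against 21.81T in dpm-qryconf). kb_to_tb is a hypothetical helper name.

```shell
#!/bin/sh
# Convert a size in kilobytes to terabytes (decimal units, 1 TB = 10^9 kB).
kb_to_tb() {
    awk -v kb="$1" 'BEGIN { printf "%.2f\n", kb / 1000000000 }'
}

kb_to_tb 21810000000
```

If the converted value differs noticeably from the dpm-qryconf capacity, the information provider on the SE may be publishing stale data.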
41 Log Files
- Logs to check
- /var/log/messages
- /var/log/fetch-crl-cron.log
- /var/log/edg-mkgridmap.log
- /var/log/lcgdm-mkgridmap.log
42 Log Files
- DPM server
  - /var/log/dpm/log
- DPM Name Server
  - /var/log/dpns/log
- SRM servers
  - /var/log/srmv1/log
  - /var/log/srmv2/log
  - /var/log/srmv2.2/log
- RFIO server
  - /var/log/rfiod/log
- DPM-enabled GridFTP
  - /var/log/dpm-gsiftp/gridftp.log
  - /var/log/dpm-gsiftp/dpm-gsiftp.log
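When troubleshooting, a first step can be to see which of these log files are actually present, since a missing log often means the corresponding daemon never started. The sketch below is one way to do that; missing_dpm_logs and its optional path prefix are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Print the expected DPM log files that are missing or unreadable.
# $1 = optional path prefix (hypothetical, handy for testing on a copy).
missing_dpm_logs() {
    prefix="${1:-}"
    for log in /var/log/dpm/log /var/log/dpns/log \
               /var/log/srmv1/log /var/log/srmv2/log /var/log/srmv2.2/log \
               /var/log/rfiod/log \
               /var/log/dpm-gsiftp/gridftp.log /var/log/dpm-gsiftp/dpm-gsiftp.log; do
        [ -r "$prefix$log" ] || echo "$prefix$log"
    done
}

# Typical use on a combined head node and disk server:
#   missing_dpm_logs
```

On a node that hosts only some of the services (for example a pure disk server), the logs of the other services are expected to be absent.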
43 Checking
- Check and, if necessary, fix the ownership and permissions of /etc/grid-security/gridmapdir:
  - ls -ld /etc/grid-security/gridmapdir
  - drwxrwxr-x 2 root dpmmgr 12288 Jun 1 14:25 /etc/grid-security/gridmapdir
- Also check the permissions of all the file systems on each disk server:
  - ls -ld /data01
  - drwxr-xr-x 3 dpmmgr dpmmgr 4096 Jun 9 12:14 data01
44 Checking
    [root@aliserv1 root]# df -Th
    Filesystem    Type    Size  Used Avail Use% Mounted on
    /dev/sda1     ext3     39G  3.2G   34G   9% /
    /dev/sda3     ext3     25G   20G  3.8G  84% /data
    none         tmpfs    1.8G     0  1.8G   0% /dev/shm
    /dev/gpfs0    gpfs     28T  2.3T   26T   9% /gpfsprod
    [root@aliserv1 root]#
45 Services and their starting order
- On the DPNS server machine:
  - service dpnsdaemon start
- On each disk server managed by the DPM:
  - service rfiod start
- On the DPM and SRM server machine(s):
  - service dpm start
  - service srmv1 start
  - service srmv2 start
  - service srmv2.2 start
- On each disk server managed by the DPM:
  - service dpm-gsiftp start
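For a machine that is both head node and disk server, as installed in this tutorial, the order above can be captured in a small loop. This is a sketch only: dpm_start_order is a hypothetical helper, and the loop is a dry run that prints the commands rather than executing them.

```shell
#!/bin/sh
# Print the DPM services in the start order given on the slide, for a node
# that is both head node and disk server.
dpm_start_order() {
    printf '%s\n' dpnsdaemon rfiod dpm srmv1 srmv2 srmv2.2 dpm-gsiftp
}

# Dry run: show the commands instead of running them.
for svc in $(dpm_start_order); do
    echo "service $svc start"
done
```

To actually start the services, replace the echo with a direct call to service, run as root.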
47 Other problems?
- gLite 3.1 User Guide
- http://igrelease.forge.cnaf.infn.it/doku.php?id=doc:guides:install-3_1
- GILDA gLite 3.1 Wiki
  - https://grid.ct.infn.it/twiki/bin/view/GILDA/GliteElementsInstallation
- Main DPM documentation page
  - https://twiki.cern.ch/twiki/bin/view/LCG/DataManagementTop
- DPM Admin Guide
  - https://twiki.cern.ch/twiki/bin/view/LCG/DpmAdminGuide
- LFC/DPM Troubleshooting
  - https://twiki.cern.ch/twiki/bin/view/LCG/LfcTroubleshooting
48 Questions