Title: Notes on archiving strategies
1. Notes on archiving strategies
- Andrei Maslennikov
- CASPUR Consortium, Rome
- Bologna, March 2002
2. Will be discussed
- CASPUR: a few numbers
- Tapes are still valid
- HSM and Staging
- Current CASPUR system
- Upcoming changes
- Hardware elements: Fibre Channel and distributed tapes; some tape media and libraries
- Main issues to consider when starting anew
3. CASPUR factsheet
- Inter-University computing consortium: 64 employees, 7.5 MEuro/year
- Non-profit computer centre: 200 modern CPUs for number crunching, with application-level support; plus around 100 service nodes; plus some 100 external nodes under support contracts (universities and research organizations, INFN among them)
- 2.5 TB of distributed storage (AFS, NFS) with automated backup; own staging system with up to 30 TB of native tape cache; tapes and server disk systems on an FC SAN
- Support for research and governmental organizations (applications, systems, network)
- Other services (Web of Science, Digital Library, etc.)
4. Tapes still cost less than disks
- At least a factor of 5 in cost per GB
- Rule of thumb: 10% of data kept on disk, 90% on tape (a quick check below)
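A back-of-the-envelope check of the rule of thumb (a minimal Python sketch; the factor 5 and the 10/90 split are the slide's numbers, everything else is illustrative):

    # Blended cost per GB of a 10/90 disk/tape split, with disk = 5x tape cost.
    tape_cost = 1.0                       # relative cost per GB on tape
    disk_cost = 5.0 * tape_cost           # the slide's "at least factor 5"
    blended = 0.10 * disk_cost + 0.90 * tape_cost
    print(blended)                        # 1.4 -> roughly 3.5x cheaper than all-disk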
5. HSM
- HSM stands for Hierarchical Storage Management. True HSM systems let one view multi-tier storage as a virtually infinite disk space: users access the disk areas without bothering about the physical location of the files.
Some points:
- HSM is probably the only solution when vendor applications in binary format must manipulate files inside the multi-tier filestore
- HSM may have a lot of extra bells and whistles (such as replication, data compression, a second copy, etc.) and may also be used purely for data moving (files may later be accessed via NFS)
- HSM may be quite expensive, especially when it is priced on a per-GB basis
- HSM may significantly reduce the freedom of choice of OS/kernel levels on both server and client machines
- Some brands: IBM HPSS and ADSM (TSM), Legato NetWorker, STK ASM, UniTree
6. Staging
- A staging system is a generic name for a tape-to-disk migration tool. Files are migrated by the user shortly before they are to be accessed on disk; migration of disk files back to tape may be automatic or manual.
- Older staging implementations required the user to keep track of his/her tape files (old CERN staging)
- The CASPUR flavour (in production since 1997) does the tape/file bookkeeping on behalf of the user. Files are addressed by name. The system is fairly easy to install and manage.
- CASTOR (CERN, project started in 1999) gives the user the option to migrate/access files both manually and via specially modified I/O calls from within a program. Addressing is also done by name. It uses a fast data transfer protocol (RFIO). It may be quite complex to install and maintain.
7. Current CASPUR solution
- Multiple NFS filesystems are mounted on the clients and on the Staging Server
- The Staging Server can handle multiple staging requests and tape drives simultaneously
[Diagram: clients mount /stage, /data, ...; the Staging Server serves these filesystems and drives the FC tapes]
8. CASPUR solution: some features
- Tape/file database: MySQL
- All the rest is written in Perl; works on any UNIX
- File headers contain full information on the disk file, so the database may be fully reconstructed via a tape scan
- Automatic occupancy watermark support
- May easily be adjusted for new migration policies (see the sketch below)
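A minimal sketch of the name-based bookkeeping and watermark ideas (hypothetical schema and names; the production code is Perl over MySQL, sqlite3 stands in here to keep the example self-contained):

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("""CREATE TABLE catalogue (
        path     TEXT PRIMARY KEY,    -- user-visible file name
        volume   TEXT NOT NULL,       -- tape volume id
        position INTEGER NOT NULL,    -- file sequence number on the tape
        size     INTEGER NOT NULL)""")
    # The same fields live in each file's tape header, so a full tape scan
    # can rebuild this table from scratch.

    def locate(path):
        """Resolve a file name to (volume, position) for a stage-in request."""
        row = db.execute("SELECT volume, position FROM catalogue WHERE path = ?",
                         (path,)).fetchone()
        if row is None:
            raise KeyError(path + " not in catalogue")
        return row

    # Occupancy watermarks: once the stage area fills past HIGH, migrate files
    # to tape (and purge them from disk) until occupancy drops below LOW.
    HIGH, LOW = 0.90, 0.75

    def purge_needed(used_bytes, capacity_bytes):
        return used_bytes / capacity_bytes > HIGH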
9. Staging II (upcoming)
- The Staging Server will only handle staging requests and dispatch them to the Movers
- Surprisingly, AFS outperforms NFS for large-file access, so we may stage into AFS
[Diagram: clients talk to the Staging Server, which dispatches requests to the Data Movers attached to the FC tapes; Data Movers and NFS/AFS Servers export /stage, /data, /afs, ... to the clients]
10. Staging II: comments
- Scalable (more data movers and file servers may be added dynamically; see the dispatch sketch below)
- Requires distributed tapes on the SAN (every mover should have access to every drive/cartridge)
- Does not depend on the make and capacity of the components
- Staging into AFS may become strategic: an infinite, scalable, secure filestore. Some (as yet unofficial) benchmarks on Linux suggest that AFS may successfully compete with NFS for large (> 1 GB) file transfers.
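The dispatch idea in a few lines (hypothetical names; only the shape of the design is taken from the slides):

    # The Staging Server keeps a pool of Data Movers and hands each staging
    # request to the least-loaded one; the pool may grow at runtime.
    movers = {"mover1": 0, "mover2": 0}   # mover name -> queued requests

    def add_mover(name):
        movers[name] = 0                  # scalability: just extend the pool

    def dispatch(path):
        name = min(movers, key=movers.get)
        movers[name] += 1
        return name                       # this mover stages 'path' via its FC drive

    print(dispatch("/stage/run42.dat"))   # -> mover1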
11. FC basic facts
Fibre Channel: a new, high-speed serial interface
- Data rates in excess of 200 MB/sec today (extensions for 1000 MB/sec approved)
- Current interface of choice for SCSI-3 in the open-systems market
- Uses optical fibre (max 10 km) and copper (max 30 m) cables
Gone are the limitations of the parallel bus:
- Signal skew: bits travelling in parallel over a data bus become displaced in time (implies limitations on the physical bus length and bus cycle time)
- A multi-drop bus suffers a non-negligible overhead to support each new device
An amalgam of I/O channel and network technologies:
- From the channel world:
  - Ability to use existing command protocols
  - High-performance transfers
  - Reliable delivery with low error rates
- From the network world:
  - Serial data transmission
  - Packaging data in packets (called frames)
  - Larger address space to support more devices
  - Ability to route information among more devices using switches
12. SAN: why we need it
Allows transparent access to tapes on servers:
- Modern tape drives are quite expensive and are less numerous than the services which may need tape access
- A Fibre Channel SAN allows networkless, and hence very efficient, tape sharing across hosts and platforms
We can build HA and file-access solutions with ease:
- Disk systems may be shared on the SAN in HA scenarios; we have tested, and are now implementing, one such solution
- On the Scalable Parallel complex (SP3) we use FC for the GPFS backup nodes
- The GFS and SANergy tests that we planned require FC
We gain an immense freedom in relocating equipment:
- With copper cables you are free to assign and reassign your Fibre Channel peripherals between hosts inside the machine room
- Optical cables allow one to connect buildings and create SCSI backbones on campus
13. Distributed Tapes at CASPUR
[Diagram: hosts reach two libraries over the FC SAN: an IBM 3584 FC library with LTO drives (mount commands sent via FC) and an STK 9740 library with 9840 and DLT drives behind a 4200 SCSI/FC bridge (mount commands sent via serial line). A Tape Dispatcher and a Tape Mounter coordinate access; mount requests and tape releases travel over the LAN. The sequence is: (1) a host sends a mount request; (2) it waits on a lock; (3) the mount command is issued; (4) the mount return code comes back; (5) the host accesses the tape over the SAN; (6) the host frees the tape.]
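The serialisation idea behind steps (1)-(6), sketched with hypothetical names (one lock per shared drive; issue_mount stands in for the real FC or serial-line command):

    import threading

    drive_lock = threading.Lock()         # one lock per shared drive

    def issue_mount(volume):
        print("mount " + volume)          # stand-in for the real mount command
        return 0                          # mount return code

    def with_tape(volume, action):        # (1) a host asks for a tape
        with drive_lock:                  # (2) lock wait until the drive is free
            if issue_mount(volume) == 0:  # (3) mount command, (4) return code
                action()                  # (5) the host accesses the tape via the SAN
        # releasing the lock frees the tape for the next host (6)

    with_tape("VOL001", lambda: print("reading..."))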
14. Media cost (very approximate)
Drives and media:
- LTO: 15 MB/sec native, 100/200 GB; roadmap: factor 2 in 2002; quite reliable, was stress-tested at CERN; linear (IBM/HP/Seagate); 120 Euro
- AIT-3: 12 MB/sec native, 100/200 GB; roadmap: AIT-4, factor 2 in 2002/3; CASPUR will be stress-testing AIT-3 in March-April; helical scan (Sony); 120 Euro
- SDLT: 11 MB/sec native, 110/220 GB; roadmap: none this year; linear (Quantum); not considered
Libraries:
- LTO library (IBM 3584: max 10 drives, max 300 slots): 80 kEuro with 6 drives and 300(?) slots; drive 7-8(?) kEuro
- AIT-3 library (Spectra Logic Gator 64000: max 32 drives, max 640 slots): 175 kEuro with 6 drives and 640 slots; drive 7-8(?) kEuro; hot-swappable drives; highest density (1 rack footprint)
15. Starting from scratch
- Decide on the number of storage tiers and their relative start-up capacities
- Choose the data-serving technology (HSM or staging)
- Select hardware that corresponds to the task (capacity and throughput should match the required data flow; see the sizing sketch below)
- Ensure scalability (make sure that bottlenecks may be eliminated by adding more CPUs/peripherals)
- Take care of the upgradability roadmap (best is to be able to mix drives and libraries of different vendors, following the changing market)
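A back-of-the-envelope sizing check for the "capacity and throughput" point (all numbers are illustrative assumptions, except the 15 MB/sec LTO native rate from the media table):

    import math

    daily_gb   = 1000                     # data to migrate per day (assumption)
    drive_mbs  = 15                       # LTO native rate, MB/sec
    duty_cycle = 0.5                      # mounts, seeks and rewinds eat ~half

    per_drive_gb_day = drive_mbs * 86400 * duty_cycle / 1024   # ~633 GB/day
    drives_needed = math.ceil(daily_gb / per_drive_gb_day)
    print(drives_needed)                  # 2 drives for this load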