Title: Making Reliable and Restorable Backups
1Making Reliable and Restorable Backups
- Presented by
- W. Curtis Preston
- President
- The Storage Group, Inc.
2Making good on your investment
- Many SANs are built in order to simplify backup,
yet often fail for lack of good design, processes,
and procedures.
- There are several common mistakes that people
make when building a backup system.
- Avoiding these mistakes and taking proper action
can create a backup system that is reliable and
restorable.
3What will we cover?
- Common Backup Configuration Mistakes
- How to Avoid Them
- Sizing your backup system
- Configuration examples for NetBackup
- Configuration examples for NetWorker
4Common Backup Configuration Mistakes
5Where do these lessons come from?
- Audits of real backup and recovery systems
- Lessons learned from real horror stories
- Many, many sleepless nights
6Too little power
- Not enough tape drives
- Tape drives that aren't fast enough
- Not enough slots in the tape library
- Not enough bandwidth to the server
7Too much power
- Streaming tape drives must be streamed
- If you don't, you will wear out your tape drives
and decrease aggregate performance
- Must match the speed of the pipe to the speed of
the tape
- You can actually increase your throughput by
using fewer tape drives
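The idea of matching the pipe to the tape can be sketched as a small calculation. This is an illustrative sketch only: the function name `drives_to_stream` and the 30 MB/s pipe are assumptions, and it assumes a drive streams only when it receives at least its rated speed.

```python
def drives_to_stream(pipe_mb_s: float, drive_mb_s: float) -> int:
    """Return how many drives the incoming pipe can keep streaming.

    Assumption: each drive needs at least drive_mb_s of data to stream.
    Any drive beyond this count would be starved and would stop and start.
    """
    if pipe_mb_s <= 0 or drive_mb_s <= 0:
        raise ValueError("speeds must be positive")
    return int(pipe_mb_s // drive_mb_s)


# Hypothetical pipe of 30 MB/s feeding drives of two different speeds.
print(drives_to_stream(30, 10))
print(drives_to_stream(30, 20))
```

With these made-up figures, spreading 30 MB/s across three 20 MB/s drives would leave every drive short of its streaming speed, which is the situation the slide warns about.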
8Not using multiplexing
- Defined: Sending multiple backup jobs to the same
drive simultaneously
- Again, drives must be streamed
- Multiplexing will impact restore performance, but
not as much as you might think
- Multiplexing can actually help your restore just
as it can help your backups
- Using multiplexing can greatly increase the
utilization of your backup hardware
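One way to reason about a multiplexing level is to ask how many simultaneous jobs it takes to feed a drive at its streaming speed. The sketch below is a rough estimate under stated assumptions: the function name, the per-job data rates, and the cap of 8 are all hypothetical, not values from any product.

```python
import math


def multiplex_level(drive_mb_s: float, job_mb_s: float, max_mpx: int = 8) -> int:
    """Estimate how many jobs to interleave on one drive.

    Assumption: every job delivers roughly job_mb_s, and the drive streams
    once the combined rate reaches drive_mb_s. max_mpx is an arbitrary cap
    that limits the impact on restores.
    """
    if drive_mb_s <= 0 or job_mb_s <= 0:
        raise ValueError("rates must be positive")
    needed = math.ceil(drive_mb_s / job_mb_s)
    return min(needed, max_mpx)


# Hypothetical: 3 MB/s network clients feeding a 20 MB/s drive.
print(multiplex_level(20, 3))
# Hypothetical: a local client that already supplies 10 MB/s to a 10 MB/s drive.
print(multiplex_level(10, 10))
```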
9Not using multistreaming
- Defined: Sending multiple simultaneous backup
jobs from a single client
- Large systems cannot be backed up serially
- Multistreaming creates a different job for each
filesystem
10Using include lists
- Most major backup software supports file system
discovery
- Still, many administrators use manually created
include lists
- Any perceived value is significantly outweighed
by the risk it creates
11Too many full backups
- If you are using a commercial backup and recovery
product with automated media management and
multiple levels, weekly full backups are a waste
of tape, time, and money
- Monthly full backups, weekly cumulative
incrementals (level 1), and daily incrementals
(level 9) work just as well and use ¼ as much tape
- Depending on the level of incremental activity,
quarterly full backups can work just as well
12Not standardizing
- Creating custom configurations for each client is
easier, but much riskier
- Creating a standard backup client configuration
can significantly decrease risk
- Create a standard exclude list, etc. and push it
out to each client
13Not even noticing!
- Backups go ignored so often. It's like they're
the bill collector nobody wants to talk to
- Backup reporting products can really help
automate reporting
- Don't ignore backups. They will bite you.
14It's just backups, right?
- "I'm an experienced, seasoned systems
administrator. This is just backups. How hard can
they be?"
- The data being backed up has become very complex,
and backup systems have matched that complexity
with functionality that is itself complex
15Not thinking about disk
- Tape is not as cheap as you thought
- Let's examine a 4 TB library
- 20 slots, 2 drives: $17K
- 20 tapes, $70 apiece: $14K
- Robotic license: $10K
- Total: $41K
- (does not include labor costs)
- That's about $10/GB
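The per-gigabyte figure can be reproduced with a few lines of arithmetic. This sketch takes the three line totals exactly as they appear on the slide, read as thousands of dollars (an assumption), and uses decimal units (1 TB = 1000 GB); the function and dictionary names are illustrative.

```python
def cost_per_gb(line_items_usd: dict, capacity_tb: float) -> float:
    """Divide the total hardware/licence cost by capacity in GB (1 TB = 1000 GB)."""
    total = sum(line_items_usd.values())
    return total / (capacity_tb * 1000)


# Line totals as listed on the slide, interpreted as US dollars (assumption).
tape_library = {
    "library (20 slots, 2 drives)": 17_000,
    "tapes": 14_000,
    "robotic license": 10_000,
}

print(f"${cost_per_gb(tape_library, 4):.2f}/GB")
```

Labor is left out, as on the slide, so the real figure would be higher.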
16Disk is cheaper than you thought
- ATA-based storage arrays as low as $5/GB (disk
only, needs filesystem)
- Special function arrays
- Quantum DX-30 looks and behaves like a Quantum
P1000. Can be used as target for tape-based
backups (3 usable TB, $55K list, or $18/GB)
- NetApp R100 looks like any other NetApp filer.
Target for SnapVault and disk-based backups,
source for SnapMirror (9 usable TB, $175K list,
or $18/GB)
- ATA disks not suited for heavy, random access,
but perfect for large block I/O (e.g. backups!)
17You can do neat things with disk
- Incremental backups are one of the greatest
backup performance challenges
- Use disk as a target for all incremental backups.
(Full, too, if you can afford it)
- For off-site storage, duplicate all disk-based
backups to tape
- Leave disk-based backups on disk
18Now that I know
- Building a reliable and restorable backup system
19Sizing the backup system
20Server Size/Power
- I/O performance more important than CPU power
- CPU, memory, I/O expandability paramount
- Avoid overbuying by testing prospective server
under load
- If you use Suns, you've got snoop and truss
21Catalog/database Size
- Determine number of files (n)
- Determine number of days in cycle (d)
- (A cycle is a full backup and its associated
incremental backups.)
- Determine daily incremental size (i = n × 0.02)
- Determine number of cycles on-line (c)
- 150-250 bytes per file, per backup
- Use a 1.5 multiplier for growth and error
- Index Size = (n + (i × d)) × c × 250 × 1.5
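The sizing steps above can be sketched as a function, reading the formula as Index Size = (n + i × d) × c × 250 × 1.5 with i taken as 2% of n. The function name, parameter names, and the example client (one million files, 28-day cycle, three cycles on-line) are assumptions for illustration.

```python
def index_size_bytes(n_files: int, days_in_cycle: int, cycles_online: int,
                     change_rate: float = 0.02, bytes_per_entry: int = 250,
                     growth_multiplier: float = 1.5) -> float:
    """Estimate catalog/index size in bytes.

    n_files        -- number of files (n)
    days_in_cycle  -- days in one full-plus-incrementals cycle (d)
    cycles_online  -- number of cycles kept on-line (c)
    change_rate    -- fraction of files in a daily incremental (i = n * rate)
    """
    daily_incremental = n_files * change_rate
    entries_per_cycle = n_files + daily_incremental * days_in_cycle
    return entries_per_cycle * cycles_online * bytes_per_entry * growth_multiplier


# Hypothetical environment: 1,000,000 files, 28-day cycle, 3 cycles on-line.
size = index_size_bytes(1_000_000, 28, 3)
print(f"{size / 1e9:.2f} GB")
```

The 250 bytes per entry is the upper end of the 150-250 range, so the result leans toward a conservative (larger) estimate.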
22Library Size - drives
- Network Backup
- Buy twice as many backup drives as your network
will support
- Use only as many drives as the network will
support (You will get more with less.)
- Use the other half of the drives for duplicating
23Library Size - drives
- Local Backup
- Most large servers have enough I/O bandwidth to
back themselves up within a reasonable time if
you're using NetBackup
- Usually a simple matter of mathematics
- 8 hr window, 8 TB = 1 TB/hr = 277 MB/s
- 30 drives at 10 MB/s, or 15 drives at 20 MB/s
- Must have sufficient bandwidth to tape drives
- Filesystem vs. raw recoveries
- Allow drives and time for duplicating!
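The "simple matter of mathematics" can be sketched as follows. The function names are illustrative, decimal units are assumed (1 TB = 1,000,000 MB), and the result covers backup drives only, with no allowance for duplication.

```python
import math


def required_throughput_mb_s(data_tb: float, window_hours: float) -> float:
    """Throughput needed to move data_tb within the window (1 TB = 1,000,000 MB)."""
    return data_tb * 1_000_000 / (window_hours * 3600)


def drives_needed(data_tb: float, window_hours: float, drive_mb_s: float) -> int:
    """Minimum number of drives, assuming every drive runs at its full speed."""
    return math.ceil(required_throughput_mb_s(data_tb, window_hours) / drive_mb_s)


# The slide's example: 8 TB in an 8-hour window.
print(round(required_throughput_mb_s(8, 8), 1))
print(drives_needed(8, 8, 10))
print(drives_needed(8, 8, 20))
```

The strict minimum comes out slightly below the slide's 30 and 15, which presumably include some rounding for headroom. Extra drives for duplication would be added on top.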
24Library Size - slots (all tape environment)
- Should hold all onsite tapes
- On-site tapes automatically expire and get reused
- Only offsite tapes require phys. mgmt.
- Should monitor library via a script to ensure
that each pool has enough free tapes before you
go home
- Watch for those downed drive messages
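A monitoring script of this kind boils down to comparing free-tape counts per pool against a minimum. The sketch below shows only that comparison. The pool names, counts, and thresholds are hypothetical, and in practice the counts would come from your backup product's media report rather than a hard-coded dictionary.

```python
def pools_short_of_tapes(free_tapes: dict, minimum_free: dict) -> dict:
    """Return {pool: shortfall} for every pool below its minimum free-tape count."""
    short = {}
    for pool, needed in minimum_free.items():
        have = free_tapes.get(pool, 0)  # a pool missing from the report counts as empty
        if have < needed:
            short[pool] = needed - have
    return short


# Hypothetical numbers; replace with counts parsed from your media report.
free = {"Onsite": 12, "Offsite": 2}
minimum = {"Onsite": 10, "Offsite": 5, "Catalog": 1}

for pool, shortfall in pools_short_of_tapes(free, minimum).items():
    print(f"WARNING: pool {pool} needs {shortfall} more free tape(s)")
```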
25Library Size - slots (disk/tape environment)
- Do incremental backups to disk
- Library only needs to hold on-site full tapes and
the latest set of copies.
- On-site tapes and disk-based backups
automatically expire and get reused
- Only offsite tapes require phys. mgmt.
- Should monitor library and disk via a script to
ensure that each pool has enough free tapes
before you go home
- Watch for those downed drive messages
26Local or Remote Backup?
- Throughput (in 8 hrs), if you own the wire
- 10 Mb = 20 GB, 100 Mb = 200 GB
- GbE = 500 GB to 1 TB (Also must own the box.)
- Greater than 500 GB should be local
- LAN-free backups allow you to share a large tape
library by performing local backups to a
remote, shared device
- More than one 500 GB server, buy a SAN!
- Only one 500 GB server, plan for a SAN!
- (NetBackup SSO, NetWorker DDS)
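For comparison, the theoretical ceiling of a link over an 8-hour window can be computed directly. This is a sketch with assumed names; the `efficiency` parameter is not from the slides and stands in for protocol overhead and client limits.

```python
def gb_per_window(link_mbit_s: float, hours: float = 8, efficiency: float = 1.0) -> float:
    """Data (in GB, 1 GB = 1000 MB) a link can move in the window.

    link_mbit_s -- nominal link speed in megabits per second
    efficiency  -- assumed fraction of nominal speed actually achieved
    """
    seconds = hours * 3600
    megabytes = link_mbit_s / 8 * seconds * efficiency
    return megabytes / 1000


for name, speed in [("10 Mb", 10), ("100 Mb", 100), ("GbE", 1000)]:
    print(f"{name}: {gb_per_window(speed):.0f} GB theoretical")
```

The slide's 20 GB and 200 GB figures are a little over half of these ceilings, and its GbE range is lower still, which fits the note that you must also own the box.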
27Multistreaming - NetBackup
- Defined: Starting multiple simultaneous backup
jobs from a single client
- Maximum jobs per client > 1
- Check "Allow multiple data streams"
- ALL_LOCAL_DRIVES, or multiple entries in file
list
- Maximum jobs per policy > 1 or unchecked
- Need storage unit with more than one drive, or
one drive with multiplexing enabled
- Can change max jobs per client using the Server
Properties -> Clients tab (4.5)
- By default, will not exceed one job per
filesystem, but can bypass this if you make your
own file list
28Multistreaming (Parallelism) - NetWorker
- Use "All" saveset or multiple entries in the
saveset list
- Set the parallelism setting for server and, if
necessary, the storage node
- Set client parallelism value in client attributes
- Must have multiple drives available, or one drive
with target sessions set higher than one
- Will not exceed number of disks or logical
volumes on the client (see maximum-sessions in
manual)
29Multiplexing - NetWorker
- Set target sessions per device, allocating how
many sessions may be sent to that device.
- Global setting for all backups that go to that
device
30Multiplexing - NetBackup
- Max multiplexing per drive in storage unit
configuration > 1
- Media multiplexing in schedule > 1
- Use higher multiplexing for incremental backups
if going to tape (6-8)
- Use lower multiplexing for local backups (2)
- No need to multiplex disk storage units
- Multiple policies can multiplex to the same
drive, but multiple media servers cannot
31Using Include lists -- not
- NetBackup: ALL_LOCAL_DRIVES in file list
- NetWorker: "All" in saveset field
- Automatically excludes NFS/CIFS drives
- Does not include dynamically mounted drives not
in /etc/fstab
32What about database clients?
- Use scripts that parse lists of databases
- /var/opt/oracle/oratab for Oracle
- MS-SQL list in registry
- Master database in Sybase
- Some backup products support "All" for databases
- Remember to write a standardized script, with
parameters, to back up databases.
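For the Oracle case, a discovery script can read the instance list from oratab instead of relying on a hand-maintained list. The sketch below parses the usual `SID:ORACLE_HOME:Y|N` layout. The sample SIDs and paths are made up, skipping a `*` placeholder entry is an assumption, and the actual backup command is deliberately left out.

```python
def parse_oratab(text: str) -> list:
    """Return (sid, oracle_home) pairs from oratab-formatted text."""
    instances = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split(":")
        if len(fields) < 2:
            continue  # not a SID:HOME entry
        sid, home = fields[0], fields[1]
        if sid == "*":  # assumption: a '*' entry is a placeholder, not an instance
            continue
        instances.append((sid, home))
    return instances


# Hypothetical oratab contents; on a real host, read /var/opt/oracle/oratab.
sample = """\
# comment line
PROD:/u01/app/oracle/product/9.2:Y
TEST:/u01/app/oracle/product/9.2:N
"""

for sid, home in parse_oratab(sample):
    print(f"would back up instance {sid} (ORACLE_HOME={home})")
```

Because the list is read at run time, a newly created instance is picked up on the next backup without anyone editing the script.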
33Incremental backups - NetBackup
- Create staggered monthly full backups using
calendar-based scheduling (CBS)
- Create staggered weekly cumulative incrementals
using CBS
- Create daily incremental backups using
frequency-based backups
- (Check "Allow after run day.")
- Delete window from previous day for CBS
34Incremental backups - NetWorker
- Do not use the Default schedule!
- Create 28 schedules with a monthly full, weekly
level 1, and daily incremental; name them after
the full day
- Do not specify a schedule for the Group
- Assign the 28 schedules evenly across all clients
based on size
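Spreading clients evenly across the 28 schedules by size is a load-balancing problem. A simple greedy approach is to place the largest clients first, each onto the schedule with the least data so far. The sketch below is one possible approach, not a NetWorker feature; the function name and client sizes are hypothetical.

```python
import heapq


def assign_schedules(client_sizes_gb: dict, n_schedules: int = 28) -> dict:
    """Map each client to a schedule number (1..n_schedules, the day of its full).

    Greedy heuristic: largest client first, always onto the lightest schedule.
    """
    heap = [(0, day) for day in range(1, n_schedules + 1)]  # (total GB, schedule)
    heapq.heapify(heap)
    assignment = {}
    by_size = sorted(client_sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    for client, size in by_size:
        load, day = heapq.heappop(heap)
        assignment[client] = day
        heapq.heappush(heap, (load + size, day))
    return assignment


# Hypothetical clients, balanced across only 2 schedules to keep the output short.
clients = {"dbserver": 100, "fileserver": 60, "webserver": 50}
print(assign_schedules(clients, n_schedules=2))
```

The result is not guaranteed to be optimal, but it keeps the full backups on any one night from piling up on the largest clients.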
35Standardization - NetWorker
- Use "All" saveset entry
- To exclude files, use standard directives for all
clients
36Standardization - NetBackup
- Use ALL_LOCAL_DRIVES
- Non-Windows clients: Use standard exclude list
and push out from master using bpgp
- Windows clients: Use standard exclude list and
push out from master using bpgetconfig -M and
bpsetconfig -h
37Backup Reporting - NetBackup
- Watch activity and device monitors
- bperror
- bpdbjobs -report
- bpdbjobs -report -all_columns
- /usr/openv/netbackup/logs
- /usr/openv/logs
- /usr/openv/volmgr/logs
38Backup Reporting - NetWorker
- Watch nwadmin screens
- mminfo
- nsrinfo
- mmlocate
- nsrmm
- /nsr/logs
39Disk-to-disk Backup - NetWorker
- If using regular disk, use file type device
- Disk backup extra cost with options
- If using virtual tape library, treat it like a
tape library
- Use cloning to duplicate disk-based backups to
tape and send them off-site
40Disk-to-disk Backup - NetBackup
- If using regular disk, use disk-based storage
unit
- (No extra cost for disk storage units!)
- If using virtual tape library, treat it like a
tape library
- Use vault to duplicate disk-based backups to tape
and send them off-site
41What about my SAN and NAS?
42SAN: LAN-free, Client-free, and Server-free
backup
NAS: NDMP filer to self, filer to filer,
filer to server, server to filer
43LAN-free backups
- How does this work?
- SCSI Reserve/Release
- Third-party queuing system
- Levels of drive sharing
- Restores
44How client-free backups work
Backup transaction logs to disk
Establish backup mirror
Split backup mirror and back it up
45How client-free recoveries work
Restore backup mirror from tape
Restore primary mirror from backup mirror
Replay transaction logs from disk
46Server-free backups
- Server directs client to take a copy-on-write
snapshot
- Client and server record block and file
associations
- Server sends XCOPY request to SAN
47Server-less Restores
- Changing block locations
- Image-level restores
- File-level restores
48NDMP Configurations
- Filer to self
- Filer to filer
- Filer to server
- Server to filer
49Using NDMP
- Level of functionality depends on the DMA you
choose
- Robotic Support
- Filer to Library Support
- Filer to Server Support
- Direct access restore support
50Resources
51Resources
- Directories of products to help you make a better
backup system
- http://www.storagemountain.com
- Send questions to
- curtis_at_thestoragegroup.com