Title: DoubleTake Product Overview
1 Double-Take Product Overview
- Jason L. Buffington
- Director, Business Continuity
- NSI SOFTWARE
Warning: No part of this document may be
reproduced or transmitted in any form or by any
means, electronic or mechanical, for any reason,
without the express written permission of NSI
Software. The information in this document is
subject to change without notice. Companies,
names and data used in examples herein are
hypothetical and/or fictitious unless otherwise
stated.
Note: Product names mentioned herein
may be trademarks and/or registered trademarks of
their respective companies.
2 Agenda
- Introduction to NSI Software
- The Technology behind Replication
- Double-Take Business Solutions
- Real World Examples
- Q&A
3 The dominant provider of continuous data
protection & availability solutions for
business applications
4 NSI Software: About Us
- Leading Provider of Data Replication and High
Availability Software and Services
- Founded in 1991
- Over 200 employees in three offices: Hoboken, NJ;
Indianapolis, IN; and Southboro, MA
- Award-winning products: Double-Take and
GeoCluster
- Over 26 patented technologies
- Comprehensive Professional Services Offerings
- Prestigious Microsoft Certifications
- Certified for Windows Standard, Enterprise and
Datacenter Server on Windows 2000 and 2003
- Gold Certified Partner
5 NSI Software Milestones
- $7 Million Strategic Investment
- 17 Consecutive quarters of XX growth
- 50,000th license sold
- $15 Million Strategic Investment
- IBM, HP and Dell standardize on DT for Win.
Storage
- NSI enters NAS market
- Microsoft Gold Partner
- Microsoft Server, Adv/Enterprise and Datacenter
certified
- Launched Double-Take 4.x and GeoCluster
- Established Indiana Research Facility
- $37 Million Strategic Investment
- Awarded US Patent 5,974,563
Timeline axis (years shown): 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1995, 1991
6 Strong Market Acceptance
The De Facto Standard in Data Replication
- Large Customer Adoption
- Over 50,000 product licenses
- Over 200 F500 customers
- More than 36% revenue growth year (04) over year
(03)
- Q1 04 Revenue vs. Q1 03 Revenue: 31.0%
- Q2 04 Revenue vs. Q2 03 Revenue: 36.2%
- Q3 04 Revenue vs. Q3 03 Revenue: 25.5%
- Q4 04 Revenue vs. Q4 03 Revenue: 52.1%
- (Marks 17th consecutive quarter of growth)
- Extensive worldwide distribution channel
- Strong demand from strategic partners
- Dell, HP, IBM, Microsoft and SunGard
7 Award Winning Solutions
- Real-time data protection
- Automatic server failover
- Disaster recovery - LAN and WAN environments
- Windows NT4, 2000, WPNAS and 2003
- (Server, Enterprise / Advanced Server, and
DataCenter)
- Provides redundancy of MSCS storage
- Allows MSCS cluster to be separated over IP
- Based on DoubleTake replication engine
- Windows NT 4, 2000, WPNAS and 2003
- (Enterprise / Advanced Server / DataCenter)
8 Production (Source)
Servers
Testing or Migration Server
High Availability (Target) Server
Off-Site Disaster Recovery (DR) Server
Optional Centralized Tape Backup
Direct offsite connection or two-stage
connection through local HA server
9 MSCS with
EXCH03 (10.1.1.64) Running Exchange Services on Drive-X
SQL02 (10.1.1.65) Running SQL Services on Drive-Y
WWW (10.1.1.66) Running WWW & FTP Services on Drive-Z
CLUSTER01 (10.1.1.63)
Arbitration Path \\FS1\Quorum
Arbitration Path \\FS2\Quorum
Arbitration Path \\FS3\Quorum
10 So, here is your Windows Network
11 How Replication ensures data integrity
12 Server Model
- Application Layer
- Exchange services
- Database engines
- File sharing
- Web applications
Applications
13 Server Model
Applications
Operating System
- Windows Operating System
- Windows NT4
- Windows 2000
- Windows-Powered NAS
- Windows 2003
- Including Storage Server
14 Server Model
Applications
Operating System
- File System or Cache
- An area of memory for disk transactions to be
stored before being written to disk
File System
15 Server Model
- Disk/Hardware layer
- Including disk drivers, disk controller
and the actual hard drives
16 Server Model with Replication Path
Applications
Replication
Operating System
File System
Hardware Layer
17 How does it REALLY work?
18 How Replication really works... from an EXCHANGE
perspective
19 How Replication really works... from an EXCHANGE
perspective
- Mail client sends to Exchange server
20 How Replication really works... from an EXCHANGE
perspective
IMMEDIATE (to DISK) for LOG dB / Recipient
/ Message To/Cc/Opt/XxXxXx
TO DATA-STORE in MEMORY dB / Recipient
/ Message To/Cc/Opt/XxXxXx
- Mail client sends to Exchange server
- Server updates LOG and DATA
21 How Replication really works... from an EXCHANGE
perspective
IMMEDIATE (to DISK) for LOG dB / Recipient
/ Message To/Cc/Opt/XxXxXx
TO DATA-STORE in MEMORY dB / Recipient
/ Message To/Cc/Opt/XxXxXx
(DATA PAGES eventually paged to disk)
- Mail client sends to Exchange server
- Server updates LOG and DATA
TWO FILE WRITES: one to LOG.CHK, one to PRIV.EDB
22 How Replication really works... from an OPERATING
SYSTEM perspective
IMMEDIATE (to DISK) for LOG dB / Recipient
/ Message To/Cc/Opt/XxXxXx
File Operation 326: Path d:\EXCHSRVR\MDBDATA\LOGS, File LOGx.CHK,
Operation Write, Start 1720 bytes, Length 42 bytes, Data To/Cc/Opt/XxXxXx
TO DATA-STORE in MEMORY dB / Recipient
/ Message To/Cc/Opt/XxXxXx
(DATA PAGES eventually paged to disk)
File Operation 487: Path d:\EXCHSRVR\MDBDATA\, File PRIV.EDB,
Operation Write, Start 92324 bytes, Length 42 bytes, Data To/Cc/Opt/XxXxXx
TWO FILE WRITES: one to LOG.CHK, one to PRIV.EDB
23 How Replication really works... from an OPERATING
SYSTEM perspective
File Operation 326: Path d:\EXCHSRVR\MDBDATA\LOGS, File LOGx.CHK,
Operation Write, Start 1720 bytes, Length 42 bytes, Data To/Cc/Opt/XxXxXx
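As an illustration only, the file operation on this slide can be modelled as a small record. The class and field names below are assumptions chosen to mirror the labels on the slide (operation number, path, file, operation, start, length, data); they are not Double-Take internals, and the 42-byte payload is placeholder data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileOperation:
    """One captured write, mirroring the fields shown on the slide (hypothetical model)."""
    sequence: int   # operation number, e.g. 326
    path: str       # directory containing the file
    file: str       # file name, e.g. LOGx.CHK
    operation: str  # "Write" in the slide's example
    start: int      # byte offset where the write begins
    data: bytes     # the bytes that were written

    @property
    def length(self) -> int:
        # Length is derived from the payload so the two cannot disagree.
        return len(self.data)


def apply_write(op: FileOperation, contents: bytearray) -> None:
    """Apply a write to an in-memory copy of the file, zero-padding any gap."""
    end = op.start + op.length
    if len(contents) < end:
        contents.extend(b"\x00" * (end - len(contents)))
    contents[op.start:end] = op.data


# Placeholder payload of 42 bytes at offset 1720, as in the slide's example.
op = FileOperation(326, r"d:\EXCHSRVR\MDBDATA\LOGS", "LOGx.CHK", "Write", 1720, b"x" * 42)
target_copy = bytearray()
apply_write(op, target_copy)
```

Because the record carries the offset and the bytes themselves, applying it elsewhere needs no knowledge of the application that produced it.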
24 How Replication really works... from a
DOUBLE-TAKE perspective
File Operation 326: Path d:\EXCHSRVR\MDBDATA\LOGS, File LOGx.CHK,
Operation Write, Start 1720 bytes, Length 42 bytes, Data To/Cc/Opt/XxXxXx
File Op 326 Path File Op Start Length DATA
- DBLHOOK (driver) asks:
- Is the Source connected to a Target? YES
- Is it a Write or Read or ???  WRITE: YES
- Is the file or directory supposed to
be replicated? YES: In the Rep Set
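The three questions above can be sketched as a simple filter. This is a minimal illustration, not the DBLHOOK driver's actual logic: the function names, the string-based operation type, and the prefix-matching rule for the replication set are all assumptions made for the example.

```python
def _normalize(path: str) -> str:
    # Windows paths are case-insensitive; a trailing backslash makes prefix
    # matching stop at directory boundaries (d:\EXCHSRVR vs d:\EXCHSRVR2).
    return path.lower().rstrip("\\") + "\\"


def in_replication_set(path: str, rep_set: list[str]) -> bool:
    """True if the path is one of, or lies under one of, the selected files/directories."""
    candidate = _normalize(path)
    return any(candidate.startswith(_normalize(root)) for root in rep_set)


def should_replicate(operation: str, path: str, connected: bool, rep_set: list[str]) -> bool:
    """Apply the slide's three checks in order (hypothetical sketch)."""
    if not connected:                  # Is the Source connected to a Target?
        return False
    if operation.lower() != "write":   # Is it a Write or a Read?
        return False
    return in_replication_set(path, rep_set)  # Is it in the Rep Set?


rep_set = [r"d:\EXCHSRVR"]
decision = should_replicate("Write", r"d:\EXCHSRVR\MDBDATA\PRIV.EDB", True, rep_set)
```

Reads and writes outside the replication set fall out at the earliest check, which is why only selected writes ever reach the queue.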
25 How Replication really works... from a
DOUBLE-TAKE perspective
File 326 Path File Op Start Length DATA
File 327 Path File Op Start Length DATA
File Op 487 Path File Op Start Length DATA
File 328 Path File Op Start Length DATA
. . .
487
328
327
326
26 How Replication really works... from a
DOUBLE-TAKE perspective
File Op 487 Path File Op Start Length DATA
. . . . . . . . . .
487
. .
328
328
327
327
326
326
27 How Replication really works... from a
DOUBLE-TAKE perspective
It's the same data in the same order
. . . . . . . . . .
With one optional exception:
File Operation 326: Path d:\EXCHSRVR\MDBDATA\LOGS, File LOGx.CHK,
Operation Write, Start 1720 bytes, Length 42 bytes, Data To/Cc/Opt/XxXxXx
T:\FS1-D\EXCHSRVR\LOGS\
328
327
326
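The idea of "the same data in the same order", with the target path as the one optional difference, can be sketched as a first-in, first-out replay with a path mapping. The function names, the tuple layout, and the source and target roots are assumptions for illustration (the drive-letter colons are reconstructed from the slide text); this is not the product's implementation.

```python
from collections import deque


def remap(path: str, source_root: str, target_root: str) -> str:
    """Swap the source root for the target root; leave other paths unchanged."""
    if path.lower().startswith(source_root.lower()):
        return target_root + path[len(source_root):]
    return path


def replay(queue: deque, source_root: str, target_root: str) -> list:
    """Drain the queue in arrival order, returning (sequence, target path) pairs."""
    applied = []
    while queue:
        sequence, path = queue.popleft()  # FIFO: oldest operation first
        applied.append((sequence, remap(path, source_root, target_root)))
    return applied


# Operation numbers follow the slides; only the location on the target differs.
queue = deque([
    (326, r"d:\EXCHSRVR\MDBDATA\LOGS\LOGx.CHK"),
    (327, r"d:\EXCHSRVR\MDBDATA\LOGS\LOGx.CHK"),
    (328, r"d:\EXCHSRVR\MDBDATA\LOGS\LOGx.CHK"),
    (487, r"d:\EXCHSRVR\MDBDATA\PRIV.EDB"),
])
result = replay(queue, r"d:\EXCHSRVR\MDBDATA", r"T:\FS1-D\EXCHSRVR")
```

Preserving arrival order is what lets a log write and its later database write land on the target in the same sequence the application issued them.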
28 How Replication really works... from a
DOUBLE-TAKE perspective
Production Cache Not Lost
Writes Only, not Reads
Data selectable at a File or Directory Level
Transaction-based Replication
Applications
Operating System
File System
Hardware Layer
29 Solution Points
Improves ROI by extending life of server
No Hidden Costs (no agents or options)
30 Solution Points for Partners
Application Independent
Doubles Storage
Improves ROI by extending life of server
Professional Services (before, during and after)
No Hidden Costs (no agents or options)
Infrastructure Equipment
Hardware Independent
Compelling Demo
31 Business Solutions
32 HIGH AVAILABILITY
- For the purposes of this presentation,
- High Availability is defined as a method by
which user data and/or applications are protected
and continue to be available to the user
community in order to allow the user community to
remain productive.
- This level of survivability assumes and
requires that the remainder of the computing
environment is functional, meaning that the
users' workstations continue to have power and
connectivity to the server resources and the
network topology has not been significantly
altered.
33 High Availability
Production (source) Servers
FS1 DB
10.9.9.1
WWW System Services started
FS2
10.9.9.2
dB System Services started
Redundant (target) Server
FS3 WWW
10.9.9.3
FSDT
10.9.9.252
FS4
10.9.9.4
FS5 F/P
10.9.9.5
34 DISASTER RECOVERY
- For the purposes of this presentation,
- Disaster Recovery is defined as a method by which
network information is protected and continues to
be available in the event that the computing
environment is critically impacted.
- This level of survivability makes no assumptions
and/or requirements for other surviving
components.
35 DISASTER RECOVERY
Production
FS1
FS2
Disaster Recovery
FS3
FS4
Off-Site Storage
FS5
36 BACKUP & RESTORE
- For the purposes of this presentation,
- There is no such thing as Backup
- The idea of backing up, or writing data to
tape/optical, is simply the PREPARATION FOR RESTORE
- It is also that series of tasks that makes
auditors happy.
37 Backup & Recovery: Why not back up at 3PM?
FS1
- Open Files on all servers
- high CPU on backup server
- high network during backup
- I/O results in system-crash
FS2
FS3
FS4
FS5
Tape Backup
38 Backup & Recovery: Why not back up at 3PM?
FS1 Dallas
- Open Files on all servers
- high CPU on backup server
- high network during backup
- I/O results in system-crash
- Every office backing up itself (managed by
non-I/S personnel)
FS2 NewYork
FS3 Chicago
FS4 Seattle
FS5 LosAngeles
Tape Backup
- Off-Site Courier Services
39 Enhanced Backup
- no Open Files - without agents
- no CPU issues
- no network bandwidth limits
- no stability concerns
- truly Centralized Backup
FS1
FS2
FS3
FS4
Off-Site Storage
Tape Backup
FS5
40 Gartner Group
- Strategic Planning Assumption: 75 percent of
large enterprises will combine data replication
and tape technology for rapid application
recovery (0.7 probability).
- Bottom Line: Organizations that are running 24x7
operations and are confronting shrinking backup
windows, or that foresee an inability to meet
service-level agreements, should plan and budget
for deploying data replication technologies.
- Gartner, Designing to Restore From Disk: Backup
Futures
41 Backup with In-Band Processing
- Production Server checkpoints databases
Production Source
42 Backup with In-Band Processing
- Production Server checkpoints databases
- Flag inserted into DT queue
43 Backup with In-Band Processing
- Production Server checkpoints databases
- Flag inserted into DT queue
- When Target receives Flag, the operation has
completed on Target, so Script executes
44 Backup with In-Band Processing
so Script executes
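The in-band idea on these slides, a flag travelling through the same queue as the data so the script runs only after everything queued before it has been applied, can be sketched as follows. The `Write` and `Flag` classes and the `drain` function are hypothetical names for illustration, not the product's API.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable


@dataclass
class Write:
    sequence: int  # stands in for a replicated file operation


@dataclass
class Flag:
    script: Callable[[], None]  # action to run on the target, e.g. start a backup


def drain(queue: deque, apply_write: Callable[[Write], None]) -> None:
    """Process queue items strictly in order; run a flag's script when it is reached."""
    while queue:
        item = queue.popleft()
        if isinstance(item, Flag):
            # Everything queued before the flag has already been applied.
            item.script()
        else:
            apply_write(item)


events = []
queue = deque([
    Write(1),
    Write(2),
    Flag(lambda: events.append("script")),  # inserted after the checkpoint
    Write(3),
])
drain(queue, lambda w: events.append(w.sequence))
```

Because the flag is ordered with the writes rather than sent out of band, the script sees the target exactly as the source looked at the checkpoint.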
45 Combining Replication and VSS Snapshots
Double-Take Replication
REMOTE SITE running Windows 2003
DATA CENTER running Windows 2003
46 ... and SANs
- SAN as replication transport
- Single Server (S to T)
- IP over Fibre Channel
- SAN to SAN
- Remote Data Protection
- Consolidation of storage onto SAN
47 Cluster Disaster Recovery
MSCS Cluster Nodes
Double-Take
Replication from Node 1
Replication from Node 2
48 Cluster Disaster Recovery
MSCS Cluster Nodes
Double-Take
Replication from Node 1
Replication from Node 2
49 Migration Projects
Monday: set up new server
Tuesday: mirroring / replication
Thursday/Friday: move users as needed
Weekend: whatever you want
Current Server: Windows NT4, single CPU, local disk
NEW Server: Windows 2000/2003, multiple CPUs, SAN or WPNAS
50 ... and any time that you need multiple copies of
your active data
Y2K / Migration / Testing Tear-Off / Snapshot
Production Server or Cluster
Remote Site Data Distribution
Balanced Web Server Farm
IP
SAN Replication
51 Customer Solution Scenarios
52 Scenario 1: H/A, D/R and B/U
SITE 1 BEFORE
FS1 dB
FS2 eMail
FS3 file
FS4 file
FS5 file
Tape Backup
53 Scenario 1: H/A, D/R and B/U
SITE 1 AFTER
FS1 dB
FS2 eMail
FS3 file
FS4 file
FS5 file
54 Scenario 1: H/A, D/R and B/U
SITE 1 AFTER
- App Server failover to servers
- File Server failover to NAS
- D/R and Backup at NAS
55 Scenario 1: H/A, D/R and B/U
TOTAL SOLUTION: 32 DoubleTakes, 3 NAS devices, 1
tape solution
56 Scenario 2: Widely Distributed Environment
- 100 remote locations
- Three target data centers
57 Scenario 2: Widely Distributed Environment
- Each Source Site has at least 4 servers (x 33)
- approximately 132 sources per target data center
58 Scenario 2: Widely Distributed Environment
- Each Data Center with 10 Target servers
59 Scenario 2: Widely Distributed Environment
33 sites x 4 servers each = 132 servers to one
target Data Center with 10 target servers:
approximately 13 sources per target
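The sizing arithmetic for this scenario can be checked with a few lines. The function name and parameters are illustrative assumptions; the inputs (100 locations, three data centers, 4 servers per site, 10 targets per data center) come from the slides.

```python
def sources_per_target(sites: int, servers_per_site: int, targets: int):
    """Return (total source servers, average sources handled by each target)."""
    sources = sites * servers_per_site
    return sources, sources / targets


# 100 remote locations spread over three data centers is about 33 sites each.
sites_per_data_center = 100 // 3

total_sources, per_target = sources_per_target(sites_per_data_center, 4, 10)
# total_sources is 132 and per_target is 13.2, i.e. roughly 13 sources per target
```

Since the slide says each site has at least 4 servers, 132 is a lower bound, and the real fan-in per target may be higher.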
60 http://www.NSISOFTWARE.com
61 Supplemental Reference slides when discussing
regulatory issues (or stats)
62 High Availability - Statistics
- 59% of Fortune 5000 companies experience a
minimum of 1.6 hours of downtime per week.
- This includes Software Crashes, Required System
Reboots, and Normal Maintenance.
- Dun & Bradstreet
- In one 2000 study, ONE out of FOUR organizations
had a significant disruption in their computer
systems.
- Arcus Data Security
- 33%: less than four hours
- 18%: five to eight hours
- 20%: nine to 24 hours
- 24%: more than 24 hours
- The cost of lost productivity can be calculated
using the average salary + benefits figure of $36 /
hour / person
- (not including lost revenue opportunities or
damage to corporate credibility)
- IDC
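As a rough worked example of the productivity calculation, the sketch below multiplies downtime hours by affected staff and by the 36 per hour per person figure from the slide (assumed here to be US dollars). The function name and the 100-person headcount are hypothetical; the 1.6 hours comes from the weekly downtime statistic above.

```python
def lost_productivity_cost(hours_down: float, people_affected: int,
                           hourly_rate: float = 36.0) -> float:
    """Estimate lost productivity; excludes lost revenue and credibility damage."""
    return hours_down * people_affected * hourly_rate


# Hypothetical organization: 100 affected people, 1.6 hours of downtime in a week.
weekly_cost = lost_productivity_cost(1.6, 100)
# Scaled over 52 weeks, under the same assumptions.
yearly_cost = weekly_cost * 52
```

The estimate is deliberately conservative, since the slide notes that lost revenue opportunities and reputational damage are not included.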
63 Disaster Recovery - Statistics
- 40% of all companies that experience a major
disaster will go out of business if they cannot
gain access to their data within 24 hours.
- Gartner
- Organizations that lost Records in a Fire:
- 44% never reopened for business.
- 30% of the rest didn't stay in business for 3
years.
- Assoc. of Records Managers and Admin.
- 2001 Market Projections for Disaster Recovery were
$11 Billion and climbing.
- Contingency Planning Research
64 Healthcare HIPAA: Security Section
- 142.308 Security standard.
- (a) Administrative procedures to guard data
integrity, confidentiality, and availability
- (3) A contingency plan, a routinely updated plan
for responding to a system emergency, that
includes performing backups, preparing critical
facilities that can be used to facilitate
continuity of operations in the event of an
emergency, and recovering from a disaster. The
plan must include all of the following
implementation features:
- (i) An applications and data criticality
analysis (an entity's formal assessment of the
sensitivity, vulnerabilities, and security of its
programs and information it receives,
manipulates, stores, and/or transmits).
- (ii) Data backup plan (a documented and routinely
updated plan to create and maintain, for a
specific period of time, retrievable exact copies
of information).
- (iii) A disaster recovery plan (the part of an
overall contingency plan that contains a process
enabling an enterprise to restore any loss of
data in the event of fire, vandalism, natural
disaster, or system failure).
- (iv) Emergency mode operation plan (the part of
an overall contingency plan that contains a
process enabling an enterprise to continue to
operate in the event of fire, vandalism, natural
disaster, or system failure).
- (v) Testing and revision procedures (the
documented process of periodic testing of written
contingency plans to discover weaknesses and the
subsequent process of revising the documentation,
if necessary).
65 SEC and NYSE
- Securities Exchange Act Rule 17a-3 is the
principal rule which sets forth the books and
records required to be made by all
broker-dealers.
- In addition, Rule 17a-4 establishes the time
periods and the manner in which such books and
records must be preserved and made accessible.
- These rules and NYSE rules relating to record
maintenance, such as Rule 440 (Books and
Records), apply to all members and member
organizations - including those acting solely as
Floor brokers and that do not conduct business
with public customers.
- Rule 446 requires a published Business Continuity
Plan.
- Members and member organizations have a
continuing responsibility to make and preserve
records, which are sufficient to satisfy the
requirements of the above rules and to
substantiate reports made to the Exchange.
- ----------------
- NYSE Rule 440 and the Securities Exchange Act
Rule 17a-4 apply to the electronic logs
maintained in lieu of paper order tickets and
reports of execution, which relate to the
member's business. In addition, members and
member organizations must ensure that all
communications whether electronic or otherwise,
including but not limited to e-mails, instant
messages, and similar communication devices that
relate to the firm's business as such must be
maintained and retained in compliance with NYSE
Rule 440 and SEA Rules 17a-3 and 17a-4.
66 SEC / Federal Reserve / Treasury: Interagency
Paper on Strengthening US Financial System
- TECHNOLOGY RECOMMENDATIONS
- The business continuity planning process should
take into consideration improvements in
technology and business processes supporting
back-up arrangements and the need to ensure
greater resilience in the event of a wide-scale
disruption.
- Core clearing and settlement organizations that
use synchronous back-up facilities or whose
back-up sites depend primarily on the same labor
pool as the primary site should address the risk
that a wide-scale disruption could impact either
or both of the sites and their labor pool. Such
organizations should establish even more distant
back-up arrangements that can recover and resume
critical operations within the business day on
which the disruption occurs.
- By the end of 2004, plans should provide for
back-up facilities that are well outside of the
current synchronous range that can meet
within-the-business-day recovery targets.
- There is general consensus that the
end-of-business-day recovery objective is
achievable for firms that play significant roles
in critical markets, although many state that
this is possible only if firms are able to
utilize synchronous data storage technologies,
which can limit the extent of geographic
separation between primary and back-up sites. A
number of commenters note that a recovery time
objective of four hours is unrealistic unless
core clearing and settlement organizations and
the telecommunications infrastructure are
operating. Some commenters suggest that recovery
and resumption time objectives should vary by
type of market. Other commenters note that
further guidance on the definitions of an "event"
and "end-of-business day" is needed to help
ensure meaningful recovery and resumption time
objectives.
67 5015.2-STD for Federal Agencies (and Contractors)
- C2.2.9. System Management Requirements. The
following functions are typically provided by the
operating system or by a database management
system. These functions are also considered
requirements to ensure the integrity and
protection of organizational records. They shall
be implemented as part of the overall records
management system even though they may be
performed externally to an RMA.
- C2.2.9.1. Backup of Stored Records. The RMA
system shall provide the capability to
automatically create backup or redundant copies
of the records and their metadata (see references
(z), (ag) and (am)).
- C2.2.9.2. Storage of Backup Copies. The method
used to back up RMA database files shall provide
copies of the records and their metadata that can
be stored off-line and at separate location(s) to
safeguard against loss due to system failure,
operator error, natural disaster, or willful
destruction (see 36 CFR 1234.30, reference (at)).
- C2.2.9.3. Recovery/Rollback Capability.
Following any system failure, the backup and
recovery procedures provided by the system shall:
- C2.2.9.3.1. Ensure data integrity by providing
the capability to compile updates (records,
metadata, and any other information required to
access the records) to RMAs.
- C2.2.9.3.2. Ensure these updates are reflected
in RMA files, and ensuring that any partial
updates to RMA files are separately identified.
Also, any user whose updates are incompletely
recovered, shall, upon next use of the
application, be notified that a recovery has been
attempted. RMAs shall also provide the option to
continue processing using all in-progress data
not reflected in RMA files (see references (z)
and (am)).
- C2.2.9.4. Rebuild Capability. The system shall
provide the capability to rebuild from any backup
copy, using the backup copy and all subsequent
system audit trails (see reference (z)).