Title: Intelligent Archiving Strategies: Toward ILM
1Intelligent Archiving Strategies Toward ILM
- Arun Taneja, Founder and Consulting Analyst, Taneja Group
- Alex Gorbansky, Senior Analyst, Taneja Group
2Agenda
- A Bit of Historical Perspective
- Why Archive?
- What to Archive?
- The ILM Panacea
- Developing an Operational Archival Strategy
- Key Considerations
- Representative Vendors and Solutions
- Conclusions
3Archival ≠ Backup
- BACKUP
- Copying production data to an alternative medium for restorability in the event of data loss, corruption, or unavailability.
- ARCHIVAL
- Retention of historical data for future access for business reasons such as audits, customer issues, or litigation.
4Some History On Archiving
- 3000 BCE: Ancient Egypt
- Library of Alexandria
- Engravings
- Middle Ages: Shift from Feudalism to Nation State
- Records
- Property rights
- 1600s: American colonists
- Births
- Marriages
- Businesses
- 1789: French Revolution
- Property records
- 1884: American Historical Association
- Archival standards
- Marriages
- Businesses
5Archival Business Drivers Today
REGULATORY COMPLIANCE REQUIREMENTS
EXPLOSIVE DATA GROWTH
APPLICATION PERFORMANCE DEGRADATION
RISING COSTS
6What to Archive?
- Structured Data
- ERP/CRM DB tiers
- Business transactions
- Unstructured Data
- Documents
- X-Rays
- Check Images
- Voice recording
- Semi-structured Data
- Email
- Instant Messaging
7ILMShmILM
- ILM is an abstract framework for describing the processes and technology used to manage information throughout its life according to its business value.
- ILM is NOT the panacea for your storage management challenges.
8Archival is a key component of what vendors are calling ILM
Applications: ERP, CRM, Email, Call Recording, Image Access
Application Data: Structured, Unstructured, Semi-Structured
Policies and Rules: Business Context, Referential Integrity, Regulatory Compliance
Data Movement Technologies: Snapshots, HSM, Replication, Backup, Archival
Storage Infrastructure Tiers: Primary, Secondary, Tertiary
9Developing an Archival Strategy
1. PLAN
- When/How
- Data Classification
- Requirements
2. DESIGN
3. IMPLEMENT
4. REPORT & TEST
10Why Plan and When to Start
- Upfront planning will result in significant benefits in future phases.
- Develop an archival strategy as part of your application design and development process.
- Engage Key Stakeholders
- Application Owners
- Business Decision Makers: Compliance Officers, Legal
- Identify Key Archival Business Drivers
- Regulatory Compliance
- Other: Data Growth, Increasing Costs, Poor Performance
11The Data Classification Puzzle
- Assess the application data in your shop according to the following categories:
- Structured: database
- Unstructured: files, videos, images
- Semi-structured: email
- Identify specific data sets impacted by regulatory compliance
- Examples: Email, Medical Records, Call Recordings
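A first pass at the classification exercise above can be automated from file names alone. The following is a minimal sketch, assuming a hypothetical extension-to-category table; the `CATEGORY_BY_EXTENSION` mapping and both function names are illustrative and would need tailoring to the applications in your shop.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical mapping from file extension to data category.
# The entries are illustrative assumptions, not a complete list.
CATEGORY_BY_EXTENSION = {
    ".db": "structured",
    ".dbf": "structured",
    ".doc": "unstructured",
    ".pdf": "unstructured",
    ".jpg": "unstructured",
    ".tif": "unstructured",
    ".wav": "unstructured",
    ".pst": "semi-structured",
    ".eml": "semi-structured",
    ".msg": "semi-structured",
}


def classify(path: str) -> str:
    """Return the data category for a path, or 'unclassified' if unknown."""
    suffix = PurePath(path).suffix.lower()
    return CATEGORY_BY_EXTENSION.get(suffix, "unclassified")


def summarize(paths) -> Counter:
    """Count how many paths fall into each category."""
    return Counter(classify(p) for p in paths)
```

Anything that lands in "unclassified" is a prompt to ask the application owner, not a category in its own right.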
12Requirements Definition
- Engage Application Owners
- Compliance is not the ONLY archival driver
- Separate requirements processes for applications
impacted by compliance.
- Compliance-specific
- Retention period
- Media characteristics
- Data restorability rates
- Access control policies
- Data availability/DR
- General archival
- Data Access Patterns
- Restore time requirements
- Application performance
- Cost structure
- Access control policies
- Data availability/DR
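The requirements listed above can be captured per application in a small record so they can be compared and checked later. This is a sketch under stated assumptions: the `ArchivalRequirement` fields, the helper names, and any values used with them are hypothetical and do not come from a specific regulation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ArchivalRequirement:
    """Hypothetical per-application requirements record."""
    application: str
    compliance_driven: bool
    retention_years: int
    max_restore_hours: float
    copies: int


def retention_end(created: date, req: ArchivalRequirement) -> date:
    """Date on which the retention period for a record ends."""
    target_year = created.year + req.retention_years
    try:
        return created.replace(year=target_year)
    except ValueError:
        # February 29 has no counterpart in a non-leap target year.
        return created.replace(year=target_year, day=28)


def may_dispose(created: date, req: ArchivalRequirement, today: date) -> bool:
    """True once the retention period has fully elapsed."""
    return today >= retention_end(created, req)
```

Keeping compliance-driven and general requirements in the same structure makes it easier to spot where applications share requirements.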
13Taming the Compliance Monster
- Understand the Regulations: Significant Variance by Industry
- Assess/Communicate Requirements to Key Business Stakeholders
- Judge Products for Yourself: Just because a vendor says a solution is "Compliant" doesn't make it so.
- Stay abreast of changes in regulatory mandates.
14Defining Key Archival Metrics
- Archive Distribution Percentages Across:
- Online: Disk, Object-based storage
- Near-line: Optical, Tape (local)
- Off-line: Off-site vaults
- Number of data copies
- Local
- Remote
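The distribution metric above is simple arithmetic once capacity per tier is known. A minimal sketch, assuming capacity figures are supplied as a dictionary keyed by tier name (the function name and input shape are illustrative):

```python
def distribution_percentages(bytes_by_tier: dict) -> dict:
    """Return each tier's share of total archived capacity, in percent."""
    total = sum(bytes_by_tier.values())
    if total == 0:
        # Nothing archived yet: report zero for every tier.
        return {tier: 0.0 for tier in bytes_by_tier}
    return {
        tier: round(100.0 * size / total, 1)
        for tier, size in bytes_by_tier.items()
    }
```

Tracking these percentages over time shows whether data is actually moving off the online tier as planned.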
15Designing an Archival Solution
- Requires an application-specific assessment; look for commonality in application requirements
- Wholly enterprise-wide strategies will be difficult to build and sustain
- Evaluate alternative solutions based on application requirements and metrics
16Don't Ignore the Organizational Dynamics
- Archival Touches Multiple Organizations
- IT Applications
- IT Infrastructure
- Legal
- Users
- Consequences of mistakes are enormous
- Fines
- Litigation
- Consider organizing a cross-functional team led
by an archival champion with a combination of
technical and business expertise
17Comprehensive Application Assessment
- Data Classification Exercise
- Data Set Size and Historical and Predicted Data Growth Rates based on business drivers
- Is Regulatory Compliance an Issue?
- Data Valuation over Time
- Access patterns of data 90 days old and beyond.
- Cost of data loss
- Going it alone can be difficult
- Available resources
- Services organizations: GlassHouse, Accenture, EDS, Storage Vendor
- Application Management Tools: File-Level SRM, Precise
- Budgetary Requirements
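Two of the assessment inputs above, predicted growth and access patterns past 90 days, lend themselves to quick estimates. The sketch below assumes compound annual growth and a list of (size, last access time) pairs; both function names and the input format are illustrative, and real figures would come from your own reporting tools.

```python
from datetime import datetime, timedelta


def projected_size(current_tb: float, annual_growth_rate: float, years: int) -> float:
    """Project data set size assuming compound annual growth."""
    return current_tb * (1 + annual_growth_rate) ** years


def stale_fraction(records, now: datetime, threshold_days: int = 90) -> float:
    """Fraction of capacity not accessed within threshold_days.

    records is an iterable of (size, last_access) pairs.
    """
    records = list(records)
    cutoff = now - timedelta(days=threshold_days)
    total = sum(size for size, _ in records)
    stale = sum(size for size, last_access in records if last_access < cutoff)
    return stale / total if total else 0.0
```

A high stale fraction on primary storage is a strong hint that the application is a good archival candidate.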
18Components of the Archival Stack
Application Data
Application Specific Module
- Discovery and analysis of data assets
- Business rules and policies definitions
- Identification and movement of specific data to appropriate storage medium
- Management, indexing of data and metadata
- Access control mechanism
Management Control
Data Flow
Storage Infrastructure
- Physical archive repository
- Data Preservation and Protection
- Indexing Technologies for Retrieval
Physical Repository
19Structured Data Archival Challenges to Investigate
- ERP deployments are still very nascent
- Preventing application downtime during archival
- Preserving referential data integrity
- Archival of core data and associated data in other tables
- Enforcing single read-only state across related data
- Delivering transparent access to archived/combined data via native app UI
- Maintaining performance of remote queries and union views.
- Update process
- Restate vs. entire reload
20Unstructured Data Considerations
- Scalability
- Sustained performance with data growth
- Hierarchical file-systems limited at large scales
- Content Access and Visibility
- Using metadata to intelligently manage and maintain the archive addresses traditional file system limitations
- Scalability of Index (Content addresses)
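The idea of a content address can be shown in a few lines: the address is derived from the data itself, so identical content maps to the same address. The sketch below is an in-memory illustration using SHA-256 from Python's `hashlib`; the `ContentAddressedStore` class is hypothetical and does not represent how any particular object storage product works.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store keyed by the SHA-256 digest of the content."""

    def __init__(self):
        self._objects = {}   # address -> bytes
        self._metadata = {}  # address -> list of metadata dicts

    def put(self, data: bytes, **metadata) -> str:
        """Store data once per unique content and record its metadata."""
        address = hashlib.sha256(data).hexdigest()
        self._objects.setdefault(address, data)
        self._metadata.setdefault(address, []).append(metadata)
        return address

    def get(self, address: str) -> bytes:
        """Return the content, verifying it still matches its address."""
        data = self._objects[address]
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored content no longer matches its address")
        return data

    def metadata(self, address: str) -> list:
        return list(self._metadata[address])

    def object_count(self) -> int:
        return len(self._objects)
```

The index from address to object is the piece that has to scale as the archive grows, which is the concern the last bullet raises.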
21Email Archival Challenges
- Stringent regulations: SEC Rule 17a-4
- Non-rewriteable, non-erasable media
- Verification of writes
- Serialize units of media
- Solution Requirements
- Server-based capture
- Support for multiple distributed Email Servers
22Meta Data Holds Real Value
- Object Age and creation date
- Object Change History
- Associated application/users
- Access control
- Priority/Criticality
- Data Access/Frequency
Meta Data is data about data
Traditional File Systems
- Digital asset tied to specific infrastructure
- No value outside of infrastructure context
Object-based systems
- Self-describing attributes for digital asset
- Enables powerful policy-based data movement applications
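Policy-based data movement driven by metadata can be sketched as a function from an object's attributes to a storage tier. Everything specific here is an assumption for illustration: the `ObjectMetadata` fields, the 90-day and 365-day thresholds, and the rule that critical objects stay on primary storage.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ObjectMetadata:
    """Hypothetical self-describing attributes carried with an object."""
    created: date
    last_accessed: date
    critical: bool


def choose_tier(meta: ObjectMetadata, today: date) -> str:
    """Pick a storage tier from metadata alone (thresholds are assumptions)."""
    idle_days = (today - meta.last_accessed).days
    if meta.critical or idle_days < 90:
        return "primary"
    if idle_days < 365:
        return "secondary"
    return "tertiary"
```

Because the decision depends only on attributes stored with the object, the same policy can run regardless of where the object currently lives.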
23Choosing the Right Storage Medium
[Chart: storage media positioned by life expectancy and recovery time]
Life Expectancy: 1 Week, 1 Month, 3 Months, 1 Year, 18-30 Years
Recovery Time: < Seconds, Minutes, Hours to Days
24Key Considerations for Storage Media
- Cost
- Access time
- Application access method
- NFS/CIFS
- Application-specific API
- Reliability/Availability
- Data Preservation Capability
- Scalability
- Archival solution integration
25Storage Media Considerations
Primary Storage
- Pros: No risk of data loss; Instantaneous access
- Cons: Exorbitant costs; Performance degradation
Secondary Storage (SATA)
- Pros: Cost effective; Solid access time
- Cons: Integration; Enforcing preservation; Management
Object Storage
- Pros: Fit for large unstructured files; Elimination of data redundancy; WORM-like preservation
- Cons: Price premium; Performance scalability with index growth
Tape
- Pros: Most cost effective; Removable; Integrated WORM
- Cons: Access time; Reliability
26Shifting towards an On-line Model
Tape
Primary
Object Storage
SATA
27Representative Vendors
Archival Solutions (by data type: Structured Data, Email, Unstructured)
- Structured Data and Email: OuterBay, Princeton Softech, Applimation, Ixos, Legato, KVS, Assentor
- Unstructured: Documentum, FileNet, NICE
Storage Platforms (by medium: SATA, Object, Tape)
- SATA: CLARiiON, STK, IBM, Nexsan
- Object: COPAN, Centera, Archivas, Permabit, DCT
- Tape: STK, Quantum, ADIC, IBM
Start with your application vendor
28Trust But Verify
- Develop processes to periodically access historical data to test:
- Data integrity
- Access time
- Manage capacity growth using vendor-supplied reporting tools
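A periodic verification pass like the one described above can be as simple as reading a sample of archived objects, comparing checksums, and timing each read. This is a minimal sketch: `read_object` stands in for whatever retrieval call your archive exposes, and the digests are assumed to have been recorded at archive time.

```python
import hashlib
import time


def verify_sample(read_object, expected_digests: dict) -> list:
    """Check integrity and access time for a sample of archived objects.

    read_object: callable taking an object id and returning bytes (assumed).
    expected_digests: object id -> SHA-256 hex digest recorded at archive time.
    Returns a list of (object_id, integrity_ok, seconds_to_read) tuples.
    """
    results = []
    for object_id, expected in expected_digests.items():
        start = time.perf_counter()
        data = read_object(object_id)
        elapsed = time.perf_counter() - start
        integrity_ok = hashlib.sha256(data).hexdigest() == expected
        results.append((object_id, integrity_ok, elapsed))
    return results
```

Running this on a schedule and keeping the results gives you evidence, not just vendor assurances, that old data is still readable within the required time.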
29Summary
- Archival is not backup and is not just about compliance
- Successful strategy requires application-centric approach
- Engage with key corporate stakeholders to define requirements and select solutions
- Look for automated and interoperable software and hardware modules.
- Be Paranoid!
30Thank you!
- Arun Taneja
- arunt_at_tanejagroup.com
- Alex Gorbansky
- alex_at_tanejagroup.com