Title: High Performance Enterprise Data Propagation
1 High Performance Enterprise Data Propagation
2 BMC Company Profile
- Established in 1980
- Leader in Application Management
- Estimated FY2000 Revenues of $1.8B
- Over 6,000 Employees
- Development Labs in Austin (TX), Conyers (GA), Houston, San Jose, Sunnyvale (CA), Waltham (MA), Germany, Israel, Singapore
- Market Coverage in Over 50 Countries
- Member of the S&P 500
3 BMC Software e-Business Availability
- Provides application management solutions that ensure the availability, performance, and recovery of business-critical applications.
- We call this application service assurance and it means that the applications companies and their customers rely on will be there when they need them.
- e-vailability - We Guarantee Our Solutions!
4 Enterprise Data Propagation (EDP): Requirement For All Enterprises
- Need to synchronize data between legacy systems and distributed relational databases for
  - Data warehousing, operational data stores, data mining
  - e-Business applications access to legacy data
  - Enterprise application integration
  - Distributed enterprises, ERP solutions, Acquisitions
- 70% of corporate data in IMS, VSAM, DB2
- Need high performance solutions
- Need near real time solutions
5 Data Propagation - Strategies For Synchronizing Multiple Copies of Data
6 Key Challenges Implementing a Data Warehouse
- Data Management Review Survey
- Business rule analysis
- Managing End User Expectation
- Business data modeling
- Reliability and integrity of data
- Data acquisition
- Meta Data management
- Managing Management Expectation
- Database performance
7 Data Warehouse Implementations
- For Customers With
  - Large operational databases
  - High transaction rates
  - 24x7 operations requirements
- Critical Management Issues
  - Availability of operational systems
  - Performance of operational transactions
  - Maintaining service levels
  - Increasing volumes of data
  - Time required to load and refresh data warehouse
  - Quality, currency & accuracy of decision-making data
8 Building Data Warehouses: A Perspective
9 Building Data Warehouses: A Perspective
10 Building Data Warehouses: A Perspective
[Diagram labels: Data Warehouse; Integration Area; Query Tools (Brio, Business Objects, Cognos, MicroStrategy)]
11 Building Data Warehouses: A Perspective
[Diagram labels: BMC Solution; Data Warehouse; Integration Area; Query Tools (Brio, Business Objects, Cognos, MicroStrategy)]
12 Change Data Propagation: A Perspective
- Change Data Propagation Is Preferred When
  - Databases are large and bulk move would take too long
  - Batch window limitations
  - Database availability limitations
  - Support for 24 x 7 is a requirement of operational application
  - Minimum latency: near-real-time is required in target database
  - Currency of information in target database is important
  - Small percentage of a large database has changed
  - Need to reduce network traffic by transmitting only data changes
[Diagram labels: Source; Target]
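To illustrate why transmitting only data changes reduces traffic when a small percentage of a large database has changed, here is a minimal Python sketch. It is not how the product captures changes (the slides describe capture at the time of the update); it simply compares two hypothetical in-memory snapshots, and the names `diff_snapshots` and `apply_changes` are assumptions made for this illustration.

```python
def diff_snapshots(old, new):
    """Return the (op, key, row) changes that turn `old` into `new`."""
    # Keys that disappeared become deletes (sorted for a stable order).
    changes = [("delete", key, None) for key in sorted(old.keys() - new.keys())]
    for key, row in new.items():
        if key not in old:
            changes.append(("insert", key, row))
        elif old[key] != row:
            changes.append(("update", key, row))
    return changes


def apply_changes(target, changes):
    """Apply a list of changes to a target copy keyed the same way."""
    for op, key, row in changes:
        if op == "delete":
            target.pop(key, None)
        else:
            target[key] = row


# Hypothetical source table before and after a batch of updates.
before = {1: ("Ann", 100), 2: ("Bob", 200), 3: ("Cy", 300)}
after = {1: ("Ann", 100), 2: ("Bob", 250), 4: ("Di", 400)}

changes = diff_snapshots(before, after)
target = dict(before)      # the target starts as a copy of the old source
apply_changes(target, changes)
```

Only the three changed rows travel to the target; the unchanged row for key 1 is never sent.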
13 Transaction Based Change Data Propagation
- Synchronous Data Propagation
  - Original update waits until all targets are updated
  - Single, global transaction with multi-site, coordinated commit processing
- Asynchronous Data Propagation
  - Propagation of updates occurs asynchronously to the originating transaction
  - Minimizes resource consumption at source
  - Minimizes impact on source transaction response times
14 Synchronous vs Asynchronous Change Data Propagation
Synchronous (Two-Phase Commit)
Source transaction completes when all databases are updated
- Advantages
  - Real time propagation
  - All sites always synchronized
- Disadvantages
  - Transaction response time
  - Data availability impact
  - System resiliency
  - Usually not practical
Asynchronous Data Propagation
Source transaction does not wait for target databases to be updated
- Advantages
  - Minimum performance impact
  - Availability
  - Autonomy
  - Recoverability
- Disadvantages
  - Target location updates may be delayed
  - All sites not always synchronized
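The contrast between the two styles can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the product's design: databases are plain dictionaries, `commit_synchronously` stands in for a coordinated commit without implementing a real two-phase protocol, and the `AsyncPropagator` class is a hypothetical name.

```python
import queue
import threading


def commit_synchronously(source, targets, key, value):
    """Synchronous style: the caller returns only after every copy is updated."""
    source[key] = value
    for target in targets:
        target[key] = value


class AsyncPropagator:
    """Asynchronous style: the source commit returns at once; a worker applies later."""

    def __init__(self, target):
        self.target = target
        self.pending = queue.Queue()
        worker = threading.Thread(target=self._apply_loop, daemon=True)
        worker.start()

    def commit(self, source, key, value):
        source[key] = value              # source transaction completes here
        self.pending.put((key, value))   # the change is queued for propagation

    def _apply_loop(self):
        while True:
            key, value = self.pending.get()
            self.target[key] = value     # target catches up some time later
            self.pending.task_done()

    def drain(self):
        """Block until every queued change has been applied to the target."""
        self.pending.join()
```

Between `commit` and `drain` the target may lag the source, which is the "not always synchronized" disadvantage listed above; in exchange, the source transaction never waits on the target.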
15 Asynchronous Change Capture Implementation Considerations
- Trigger Based
  - Triggers used to capture changes to database records
  - Incremental updates collected in staging tables
  - Significant resource consumption for triggers and logging
  - Typically low volume applications (< 20 transactions/second)
- Log Exit Based
  - Increased logging in operational environment
  - Increased response times for source transactions
  - Increased resource consumption
  - Log management issues
- Log Post Process Based
  - Increased logging in operational environment
  - Log management issues
  - Long latency interval cannot support near real time
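The trigger-based option above can be sketched with Python's built-in `sqlite3` module. SQLite is used only because it is easy to run; the table names `account` and `account_changes` are hypothetical and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);

-- Staging table that collects the incremental updates.
CREATE TABLE account_changes (
    seq     INTEGER PRIMARY KEY AUTOINCREMENT,
    op      TEXT,
    id      INTEGER,
    balance INTEGER
);

CREATE TRIGGER account_ins AFTER INSERT ON account BEGIN
    INSERT INTO account_changes (op, id, balance) VALUES ('I', NEW.id, NEW.balance);
END;
CREATE TRIGGER account_upd AFTER UPDATE ON account BEGIN
    INSERT INTO account_changes (op, id, balance) VALUES ('U', NEW.id, NEW.balance);
END;
CREATE TRIGGER account_del AFTER DELETE ON account BEGIN
    INSERT INTO account_changes (op, id, balance) VALUES ('D', OLD.id, OLD.balance);
END;
""")

# Ordinary application updates; each one also fires a trigger.
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("UPDATE account SET balance = 150 WHERE id = 1")
conn.execute("DELETE FROM account WHERE id = 1")

captured = conn.execute(
    "SELECT op, id, balance FROM account_changes ORDER BY seq"
).fetchall()
```

Every source update performs a second write into the staging table, which is one way to see the resource consumption concern listed for this approach.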
16 Enterprise Data Propagation (EDP): The BMC Solution
- A Data Propagation Management System
- A single point of access for managing legacy data propagation across the enterprise
- Efficient change capture
- Basic data transformation
- High performance data movement
- High performance utilities
- Common look and feel
- Integrated transformations and mappings
- Integrated recovery/restart
17 ChangeDataMove Product Positioning
- Positioning
  - ChangeDataMove is a high performance, efficient change data propagation solution that captures changes made to IMS, Fast Path, VSAM, and DB2 databases and propagates those changes to the most prevalent relational databases.
- What It Does
  - Transaction-based data propagation
  - Supports high volume production applications with hundreds of transactions per second
  - Supports near real-time as well as scheduled data propagation
- Advantages
  - A data propagation system (complete solution vs a point product)
  - Highly efficient change capture does not impact applications
  - Only solution for IMS, Fast Path and VSAM that does not require logging
  - Optionally integrated with DataMove for bulk data movement
18 Change Data Propagation for IMS and VSAM
- Synchronous Change Capture
  - Transparent high performance change capture
  - Minimum impact on source system logging, CPU & user response time
  - Data is available immediately for asynchronous propagation
- Asynchronous Data Propagation
  - Data Propagated Within Context of Original Transaction
  - Updates applied in proper sequence
  - Inter and intra-table consistency
  - Source and target(s) consistent within transaction boundaries
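Applying updates in sequence and within transaction boundaries can be sketched as follows. This is an illustrative Python model, not the product's apply logic: the change-record fields (`txn`, `seq`, `op`, `table`, `key`, `row`) are hypothetical, and the sketch assumes each transaction's changes arrive contiguously in capture order.

```python
from itertools import groupby


def apply_by_transaction(target, changes):
    """Apply captured changes one source transaction at a time, in capture order.

    `target` maps table name -> {key: row}. Each transaction is staged on a
    copy and swapped in as a unit, so a reader of `target` never sees half of
    a transaction.
    """
    ordered = sorted(changes, key=lambda c: c["seq"])
    for _txn_id, group in groupby(ordered, key=lambda c: c["txn"]):
        staged = {table: dict(rows) for table, rows in target.items()}
        for change in group:
            rows = staged.setdefault(change["table"], {})
            if change["op"] == "delete":
                rows.pop(change["key"], None)
            else:                      # insert and update both set the row
                rows[change["key"]] = change["row"]
        target.clear()
        target.update(staged)          # transaction boundary: publish as a unit


# Two hypothetical source transactions touching two related tables.
captured = [
    {"txn": "T1", "seq": 1, "op": "insert", "table": "orders", "key": 10, "row": {"cust": 7}},
    {"txn": "T1", "seq": 2, "op": "insert", "table": "items", "key": (10, 1), "row": {"sku": "A"}},
    {"txn": "T2", "seq": 3, "op": "update", "table": "orders", "key": 10, "row": {"cust": 8}},
    {"txn": "T2", "seq": 4, "op": "delete", "table": "items", "key": (10, 1), "row": None},
]
replica = {}
apply_by_transaction(replica, captured)
```

Because the order row and its item row from T1 are published together, the target keeps the inter-table consistency the slide calls for.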
19 IMS Change Capture
- Resides within the IMS environment
- Captures DL/I calls as they occur
- Supports IMS/TM (MPP, BMP), Fast Path, CICS DBCTL, Batch DL/I
- Commits updates at transaction or job (batch) end
- Based on BMC Software's CHANGE RECORDING FACILITY
[Diagram labels: BMC Apply; EDP Logger; LRP; TNR; OEM Apply]
20 CICS/VSAM Change Capture
- Captures changes at each Get, Put & Erase request
- Utilizes CICS TRUE, File, and Re-sync exits
- Resides as functional part of CICS address space
- Participates in two phase commit with CICS transaction
- Updates are committed when transaction commits
[Diagram labels: CICS Subsystem; User Application; ECCR; Database; EDP Logger; LRP; TNR; BMC Apply; OEM Apply]
21 VSAM Batch Change Capture
- JRNAD (journal) exit dynamically activated
- ECCR resides within the batch address space
- UOW is complete when application closes VSAM file
22 DB2 MVS Change Capture
- Requires DB2 change data capture be activated
- Reads log records via DB2 IFI, external decompression
- Maintains multiple versions of schema
23 The Transformation Process
- Transforms IMS, Fast Path and VSAM data to relational formats
  - Hierarchical structures to relational structures
  - Converts non-relational data types to relational
- Uses relational DBMS catalog information
- Uses copy libraries and IMS database descriptors
- Automatically handles Dates, Times, Data Types
- Repeating groups, Redefined records
- Customizable through user exits
24 Possible Target Keys
- To allow resulting target rows to be unique
- Replication Key (REPKEY)
  - This key will make the target row unique
  - For IMS it is the full concatenated key or the segment's RBA
- Ancestor Keys
  - If REPKEY is a composite key (i.e. IMS concatenated key) each level is available to be used as the key of the target row
- Sequential number
  - If a single input segment or record creates multiple output rows, a sequential numeric column can be generated.
- Any field in the input segment or record
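As an illustration of hierarchical-to-relational mapping with a concatenated key, the Python sketch below flattens a nested segment tree into rows, giving each row one key column per ancestor level. The segment layout (`type`, `key`, `fields`, `children`) and the function name `flatten` are assumptions made for this example, not the product's data model.

```python
def flatten(segment, ancestor_keys=()):
    """Yield (segment_type, row) pairs; each row carries every ancestor key level."""
    # The concatenated key is the chain of keys from the root to this segment.
    full_key = ancestor_keys + ((segment["type"], segment["key"]),)
    row = {f"{seg_type}_key": key for seg_type, key in full_key}
    row.update(segment["fields"])
    yield segment["type"], row
    for child in segment.get("children", []):
        yield from flatten(child, full_key)


# Hypothetical three-level hierarchy: customer -> order -> item.
customer = {
    "type": "customer", "key": "C001", "fields": {"name": "Acme"},
    "children": [{
        "type": "order", "key": "O1", "fields": {"total": 250},
        "children": [{"type": "item", "key": "I1", "fields": {"sku": "X"}}],
    }],
}

rows = list(flatten(customer))
```

Each segment type would map to its own target table; the item row carries `customer_key`, `order_key`, and `item_key`, so any level of the concatenated key can serve as a key column on the target.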
25 Transforming COBOL Structures
- Repeating Groups
  - All repeated fields to a single target column
  - As individual rows in the same or a different table
  - Update results in set of deletes and inserts for target rows
- Redefined Records
  - Assigned unique names and schema definitions
  - Record identification exit identifies record types
  - Schema applied to segment or record based on redefined record type
  - Redefined records can be propagated to same or different targets
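The "individual rows" option for repeating groups, and the delete-plus-insert handling of updates, can be sketched in Python as follows. The record layout and the names `explode_repeating_group` and `update_target` are hypothetical; a generated `seq_no` column keeps the output rows unique.

```python
def explode_repeating_group(record, key_field, group_field):
    """Turn one record with a repeating group into one row per occurrence."""
    key = record[key_field]
    return [
        {key_field: key, "seq_no": n, **occurrence}
        for n, occurrence in enumerate(record[group_field], start=1)
    ]


def update_target(target_rows, record, key_field, group_field):
    """An update to the source record becomes deletes plus inserts on the target."""
    key = record[key_field]
    kept = [row for row in target_rows if row[key_field] != key]            # deletes
    return kept + explode_repeating_group(record, key_field, group_field)   # inserts


# Hypothetical record: one policy with a repeating group of two coverages.
policy = {
    "policy_no": "P1",
    "coverages": [{"code": "A", "limit": 100}, {"code": "B", "limit": 200}],
}
target_rows = explode_repeating_group(policy, "policy_no", "coverages")

# The source record is updated so that only one coverage remains.
policy_v2 = {"policy_no": "P1", "coverages": [{"code": "B", "limit": 300}]}
target_rows_v2 = update_target(target_rows, policy_v2, "policy_no", "coverages")
```

Replacing the whole set of rows avoids having to match old and new occurrences one by one when the number of occurrences changes.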
26 High Performance Transport & Apply
- Data is blocked, compressed and encrypted
- Multi-threaded apply tasks for increased performance
[Diagram labels: Send; Transport; TCP/IP; Transport; Receive; DB2 Dynamic Memory Staging Queue; Oracle Dynamic Memory Queue; multiple EDP Apply tasks per queue]
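Blocking and compressing change records before transport can be sketched with Python's standard library. This is an illustration only: JSON is an assumed wire format, `pack_blocks` and `unpack_blocks` are hypothetical names, and encryption is left out because the standard library has no cipher; a real transport would encrypt each block after compression.

```python
import json
import zlib


def pack_blocks(records, block_size):
    """Group records into blocks and compress each block for sending."""
    for start in range(0, len(records), block_size):
        block = records[start:start + block_size]
        yield zlib.compress(json.dumps(block).encode("utf-8"))


def unpack_blocks(blocks):
    """Reverse of pack_blocks on the receiving side; yields records in order."""
    for payload in blocks:
        yield from json.loads(zlib.decompress(payload).decode("utf-8"))


# Hypothetical change records with repetitive content, as is typical of row data.
records = [{"id": i, "status": "OPEN"} for i in range(1000)]

blocks = list(pack_blocks(records, block_size=250))
received = list(unpack_blocks(blocks))

raw_size = len(json.dumps(records).encode("utf-8"))
sent_size = sum(len(b) for b in blocks)
```

Blocking amortizes per-message overhead across many records, and compression shrinks what is left; here `sent_size` is much smaller than `raw_size`.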
27 Automated Schema Replication
- Reduce administration costs by automating the creation of target tables from IMS, VSAM, and DB2 source schema
[Diagram labels: DBD; Copybook; DB2 Catalog; SchemaMove; DB2]
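To give a sense of what automating target table creation from a copybook involves, here is a deliberately small Python sketch. It is not SchemaMove's behavior: it handles only `PIC X(n)`, `PIC 9(n)`, and `PIC S9(n)V9(m)` clauses, and the type mapping (`CHAR`, `DECIMAL`) and the function names are assumptions made for illustration.

```python
import re

# Matches lines such as "05 CUST-ID PIC 9(6)." (level, name, PIC clause).
FIELD = re.compile(r"^\s*\d+\s+([A-Z0-9-]+)\s+PIC\s+(\S+?)\.\s*$")


def sql_type(pic):
    """Map a small subset of COBOL PIC clauses to relational column types."""
    match = re.fullmatch(r"X\((\d+)\)", pic)
    if match:
        return f"CHAR({match.group(1)})"
    match = re.fullmatch(r"S?9\((\d+)\)(?:V9\((\d+)\))?", pic)
    if match:
        digits, scale = int(match.group(1)), int(match.group(2) or 0)
        if scale:
            return f"DECIMAL({digits + scale},{scale})"
        return f"DECIMAL({digits})"
    raise ValueError(f"unsupported PIC clause: {pic}")


def copybook_to_ddl(table, copybook):
    """Build a CREATE TABLE statement from the elementary fields of a copybook."""
    columns = []
    for line in copybook.splitlines():
        match = FIELD.match(line)
        if match:                       # group-level lines have no PIC and are skipped
            name = match.group(1).replace("-", "_")
            columns.append(f"  {name} {sql_type(match.group(2))}")
    return f"CREATE TABLE {table} (\n" + ",\n".join(columns) + "\n)"


# Hypothetical copybook for a customer record.
COPYBOOK = """
01 CUSTOMER-REC.
   05 CUST-ID    PIC 9(6).
   05 CUST-NAME  PIC X(30).
   05 BALANCE    PIC S9(7)V9(2).
"""

ddl = copybook_to_ddl("CUSTOMER", COPYBOOK)
```

A production tool would also need to cover OCCURS, REDEFINES, and binary or packed usages, which this sketch leaves out.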
28 Bulk Data Propagation
- Bulk move is usually simpler and easier to implement
- Needed to initially create or to refresh a target database
- Bulk move is the preferred solution when
  - Data volumes are not large and the move can be performed within time constraints
  - Database availability is not a concern (source/target)
  - Network volumes and network overhead are not issues
  - Currency of information in target database is not a concern
  - Change data propagation cannot handle the volumes
29 Bulk Data Movement DB2 to Oracle: The Traditional Approach
- MVS Host: DB2 Extract, DB2 Unload 20 min. (35% of elapsed time)
- Gateway to Gateway over TCP/IP: File Transfer 7 min. (13%)
- UNIX Server: Oracle Loader, Oracle SQL Load 28 min. (52%)
- Total Time 55 min.
30 Bulk Data Movement DB2 to Oracle: Parallel Unload, Parallel Load
- MVS Host: DB2 Extract, Parallel Unload 7 min.
- Gateway to Gateway over TCP/IP: File Transfer 7 min.
- UNIX Server: Oracle Loader, Oracle SQL Load 16 min.
- Total Time 30 min.
31 Bulk Data Movement DB2 to Oracle: Parallel Unload/Load with Piping
- MVS Host: DB2 Extract, Parallel Unload
- Gateway to Gateway over TCP/IP: Piping
- UNIX Server: Oracle Loader, Parallel Load
- Oracle load starts as first record is read from DB2
- Total Time 17 min.
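The timings on the three bulk-movement slides can be checked with a few lines of Python. The per-step numbers are taken from the slides, reading the "16M" on the parallel slide as 16 minutes because that is the value that makes the stated 30-minute total add up; the piped approach is given only as a total, so no per-step breakdown is assumed for it.

```python
# Minutes per step, as stated on the slides.
steps = {
    "traditional": {"unload": 20, "transfer": 7, "load": 28},
    "parallel": {"unload": 7, "transfer": 7, "load": 16},
}

# When steps run one after another, elapsed time is the sum of the steps.
totals = {name: sum(minutes.values()) for name, minutes in steps.items()}
totals["piped"] = 17    # the slide gives only the total for the piped approach

# Speedup of each approach relative to the traditional one.
speedup = {name: totals["traditional"] / total for name, total in totals.items()}

# With piping the steps overlap, so elapsed time cannot drop below the longest step.
longest_parallel_step = max(steps["parallel"].values())
```

The sums reproduce the slide totals of 55 and 30 minutes. The piped total of 17 minutes sits just above the longest parallel step (16 minutes), which is what one would expect if unload, transfer, and load overlap almost completely.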
32 DataReach Product Positioning
- Positioning
  - DataReach is a high performance, high availability data movement solution for extracting MVS/ESA DB2 data and loading it into Informix, Oracle or Sybase databases on Unix.
  - A joint development effort of EMC & BMC - Not A Product We Sell Today
- What It Does
  - Uses EMC Storage to move data at channel speeds vs network speeds
  - Moves the work of extracting DB2 MVS data from MVS to Unix
- Advantages
  - Moves data 10 to 100 times faster than network solutions
  - Completely eliminates mainframe processing
  - Completely eliminates network traffic and network overhead
  - Allows nearly 100% availability of the source DB2 database
  - Enables customers to more frequently refresh data warehouses
33 Bulk Data Movement DB2 to Oracle: The DataReach Approach
- DataReach Directly Extracts DB2 Data
  - Eliminates network traffic & network overhead
  - Familiar SQL-based SELECT syntax
  - Subset of data via WHERE predicate
  - Optional parallel extraction capability
  - Optional access via DB2 Index structures
- Data conversion
  - EBCDIC to ASCII
  - DB2 to generic format
- Direct load of Oracle, Sybase, Informix
  - Optional parallel load capability
  - Distributed capabilities
[Diagram labels: MVS Host (DB2); UNIX Host (DB2 Extract, Oracle Loader, Oracle)]
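The data conversion step can be illustrated with Python's standard codecs. This is a sketch under assumptions, not DataReach's translation module: code page `cp037` (EBCDIC US/Canada) is an assumed default, and the packed-decimal decoder shows one common mainframe numeric encoding as an example of converting to a generic format.

```python
from decimal import Decimal


def ebcdic_to_text(data: bytes, codepage: str = "cp037") -> str:
    """Decode EBCDIC bytes; cp037 is an assumption, other sites use other pages."""
    return data.decode(codepage)


def unpack_decimal(data: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal field: two digits per byte, last nibble is the sign."""
    nibbles = []
    for byte in data:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign = nibbles.pop()                      # 0xB or 0xD means negative
    value = int("".join(str(n) for n in nibbles))
    if sign in (0x0B, 0x0D):
        value = -value
    return Decimal(value).scaleb(-scale)      # place the implied decimal point


name = ebcdic_to_text(bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6]))
amount = unpack_decimal(b"\x12\x34\x5C", scale=2)
debit = unpack_decimal(b"\x01\x2D")
```

Character fields and numeric fields need different treatment: running a code page conversion over a packed-decimal field would corrupt it, so the conversion has to be driven by the column definitions.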
34 DataReach: How It Works
[Diagram labels: MVS System (DB2); ESCON Channels; SYMMETRIX ESP (CKD Volumes, FBA Volumes); SCSI Channels; UNIX (Extractor, Translation Module, Flat File); Native load utility of target RDBMS; Target DBMS]
35 DataReach Performance Benchmark: DB2 to Oracle on HP-UX
36 Traditional Process vs DataReach
37 DataReach Operational Considerations
- Data Consistency: Quiesce DB2
- High Availability: Use a mirror copy in Symmetrix
- Security: DataReach Authorization Table in DB2
  - DB2 Read access
  - Unix Login
  - Target RDBMS authorizations
38 Extract, Transform, Move & Load Options: A Performance Perspective
39 High Performance Data Propagation Strategy for Supporting Data Warehouse
[Diagram labels: Integration Area; Operational Data Store; Change History; Data Warehouse Refresh; Data Warehouse; Data Mart; Data Mart; Business Intelligence Systems]
40 High Performance Data Propagation Strategy for Supporting DW & e-Business
[Diagram labels: Updates; Inquiries; App. Server; High Performance Data Propagation; Integration Area; Operational Data Store; Change History; Data Warehouse Refresh; Data Warehouse; Data Mart; Data Mart; Business Intelligence Systems]
41 High Performance Data Propagation Strategy for Enterprise Application Integration
Note: This is a BMC Services Offering
[Diagram labels: PeopleSoft; Baan; Oracle; SAP; ERP Tools; Messaging; Bulk Message Queue; Change Message Queue; High Performance Data Propagation; Data Warehouse; Data Mart; e-business Applications]
42 Major U.S. Brokerage Firm
- Application Integration example
- Global corporation headquartered in New York City providing
  - Securities
  - Asset Management
  - Credit and transaction services
43 The Problem
- Business challenge
  - Migration to new strategic DBMS could not impact business operations
- Technical challenge
  - Keep current ADABAS DBMS synchronized with new strategic DB2 DBMS
  - The solution had to be sustainable for the long-term and also be scalable
44 The Solution
- Client already had an ADABAS log capture mechanism and MQSeries.
- A custom adapter for source MQSeries to ChangeDataMove
  - Written in ASM
  - Runs as a started task
- Primarily batch with over 700 files (as sources).
45 Major U.S. Bank
- e-Business example
- Provides anytime, anywhere access to products and services through
  - Walk up services
  - Automated Teller Machines (ATM)
  - 24-Hour Phone Banking
  - Internet banking
- Offices in 17 Midwestern and Western states
46 The Problem
- Business Challenge
  - Multiple access methods drive a need to provide a common method to authenticate an account owner
- Technical Challenge
  - Account verification information is maintained in purchased IMS application
  - Move to leading edge Storage Area Network technology and required integration.
47 The Solution
- Target is not a conventional DBMS but a storage area network.
- High data volumes
- Target data written to MQSeries
48 High Performance Data Propagation: Facilitating DBMS Migrations
- Change target DBMS without impacting operational applications
- Move target DB from Sybase to Oracle to SQL Server to UDB to ??
[Diagram labels: EDP Logger; LRP; TNR; BMC Apply; OEM Apply]
49 BMC's Data Propagation is Different?
- Transaction based data propagation supports applications executing hundreds of transactions/second
- For IMS, Fast Path, CICS VSAM and VSAM Batch
  - Does not use IBM capture exits, logs, or require any additional logging
  - Automatically transforms non-relational data structures to relational
  - Supports Near-Real-Time with minimum latency for target updates
  - No requirement for DB2 staging tables and associated logging
  - Captures changes from VSAM batch applications even when no logs are used
- For DB2
  - No requirement for DB2 staging tables and associated logging
  - Transaction consistent propagation
  - Supports Near-Real-Time with minimum latency for target updates
- Component of a Complete Enterprise Data Movement Solution
  - Common management console - Easy to administer
  - Integrated restart/recovery of the propagation process
  - Shared data transformations
50 Extract, Transform, Move & Load Options: A Performance Perspective