Title: How Oracle Database 10g Revolutionizes Availability and Enables the Grid
2. How Oracle Database 10g Revolutionizes Availability and Enables the Grid
Session id 40164
- Juan Loaiza, Vice President, Systems Technologies
- Oracle Corporation
3. From High Quality Parts to High Quality Systems
- Traditionally, Low Cost = Low Quality
- High quality systems were built by combining high quality, high cost parts (the Mainframe model)
- Oracle enables a new model
- Oracle combines high volume, inexpensive processors and storage to produce a high quality system
- Unbreakable, Inexpensive Systems
4. Low Cost Fault Tolerance
- Unplanned Downtime: Computer Failures, Data Failures
- Planned Downtime: System Changes, Data Changes
- Grid Clusters: Low Cost Fault Tolerance
5. Commercial Grids and Availability
- Grid pools standard low cost nodes and modular disk arrays
- Perfect for RAC HA
- Failover can happen to any node on the grid
- Grid load balancing will redistribute load over time
Designed to Tolerate Failures
6. New in 10g for Availability
- Integrated clusterware
- Integrated, fewer moving parts, better tested
- Avoid operator errors by reducing cross-product coordination
- Built-in integration with Application Server for failing over connection pools
- Faster failover between nodes
- Single digit seconds
7. New Economics for Data Protection and Recovery
- Unplanned Downtime: Computer Failures, Data Failures
- Planned Downtime: System Changes, Data Changes
- Disk Based Recovery: trade cheap disk space for expensive downtime
8. New World: Disk Based Data Recovery
- Disk economics are close to tape
- Disk is better than tape
- Random access to any data
- We rearchitected our recovery strategy to take advantage of these economics
- Random access allows us to back up and recover just the changes to the database
- Backup and recovery go from hours to minutes
1980s: 200 MB; 2000s: 200 GB (1000x increase)
9. Resiliency using Low Cost Storage
- Unplanned Downtime: Computer Failures, Data Failures
- Planned Downtime: System Changes, Data Changes
- Four Failure Types: Storage Failure, Human Error, Corruption, Site Failure
10. Data Mirroring with ASM
- ASM mirrors data across inexpensive modular storage arrays
- No additional logging or expensive NVRAM to recover mirrors
- Database logging recovers mirrors
- Automatically remirrors when disk or array fails
- Designed to tolerate failures
Failure Resiliency using Low Cost Storage
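As a rough illustration of mirroring across arrays, the statements below sketch an ASM disk group with one failure group per storage array, so each extent is mirrored onto a different array. The disk group name, failure group names, and device paths are hypothetical; the statements are run in the ASM instance.

```sql
-- Hypothetical names and paths: one failure group per modular storage array.
-- NORMAL REDUNDANCY keeps two copies of each extent, in different failure groups.
CREATE DISKGROUP data NORMAL REDUNDANCY
  FAILGROUP array1 DISK '/dev/raw/raw1', '/dev/raw/raw2'
  FAILGROUP array2 DISK '/dev/raw/raw3', '/dev/raw/raw4';

-- Check the redundancy type and remaining space of each disk group.
SELECT name, type, total_mb, free_mb
FROM   v$asm_diskgroup;
```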
11. Collapsing the Cost of Human Error
[Diagram: downtime taxonomy repeated from slide 9]
12. Human Error
Single Biggest Cause of Downtime
- Goal is to quickly analyze and repair
- For localized damage
- Need surgical analysis and repair
- Example: deleted the wrong order
- For widespread damage
- Need complete back-out to avoid long downtime
- Example: batch job deletes this month's orders
13. Flashback Time Navigation
- Flashback Query
- Query all data at point in time
- Flashback Versions Query
- See all versions of a row between two times
- See transactions that changed the row
- Flashback Transaction Query
- See all changes made by a transaction
Select * from Emp AS OF '2:00 P.M.' where ...
Select * from Emp VERSIONS BETWEEN '2:00 PM' and '3:00 PM' where ...
Select * from DBA_TRANSACTION_QUERY where xid = '000200030000002D'
[Diagram: row versions over time, labeled Tx 1, Tx 2, Tx 3]
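The slide queries are shorthand. A fuller sketch in released Oracle Database 10g syntax might look like the following. The EMP columns, the employee number, and the dates are assumptions for illustration, and the undo retention must cover the time window. In the released product the transaction view is named FLASHBACK_TRANSACTION_QUERY rather than the name shown on the slide.

```sql
-- Flashback Query: the row as it looked at a past point in time.
SELECT empno, ename, sal
FROM   emp AS OF TIMESTAMP
         TO_TIMESTAMP('2003-09-09 14:00:00', 'YYYY-MM-DD HH24:MI:SS')
WHERE  empno = 7788;

-- Flashback Versions Query: every version of the row between two times,
-- with the transaction that created each version.
SELECT versions_xid, versions_starttime, versions_endtime,
       versions_operation, sal
FROM   emp VERSIONS BETWEEN TIMESTAMP
         TO_TIMESTAMP('2003-09-09 14:00:00', 'YYYY-MM-DD HH24:MI:SS')
     AND TO_TIMESTAMP('2003-09-09 15:00:00', 'YYYY-MM-DD HH24:MI:SS')
WHERE  empno = 7788
ORDER  BY versions_starttime;

-- Flashback Transaction Query: all changes made by one transaction,
-- including the SQL that would undo each change.
SELECT operation, table_name, undo_sql
FROM   flashback_transaction_query
WHERE  xid = HEXTORAW('000200030000002D');
```

The VERSIONS_XID value returned by the second query is what feeds the third, which is how a single bad row can be traced back to everything its transaction touched.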
14. Flashback Database
- A new strategy for point in time recovery
- Flashback Log captures old versions of changed blocks
- Think of it as a continuous backup
- Replay log to restore DB to time
- Restores just changed blocks
- It's fast: recover in minutes, not hours
- It's easy: single command restore
- Flashback Database to '2:05 PM'
Rewind button for the Database
[Diagram: on a disk write, the new block version goes to the Data Files and the old block version goes to the Flashback Log]
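A minimal sketch of the single-command restore, assuming flashback logging was already enabled (ALTER DATABASE FLASHBACK ON), a SYSDBA session in SQL*Plus, and a hypothetical target date. The database must be mounted but not open when the command runs.

```sql
-- SHUTDOWN and STARTUP are SQL*Plus commands, not SQL statements.
SHUTDOWN IMMEDIATE
STARTUP MOUNT

-- Rewind the whole database to the chosen time using the flashback logs.
FLASHBACK DATABASE TO TIMESTAMP
  TO_TIMESTAMP('2003-09-09 14:05:00', 'YYYY-MM-DD HH24:MI:SS');

-- Open with a new incarnation once the rewind point looks right.
ALTER DATABASE OPEN RESETLOGS;
```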
15. Flashback Error Correction
- Recovery at all levels
- Database Level
- Flashback Database restores the whole database to time
- Uses Flashback Logs
- Table Level
- Flashback Table restores rows in a set of tables to time
- Uses UNDO in database
- Flashback Drop restores a dropped table or an index
- Recycle bin for DROPs
- Row Level
- Restore individual rows
- Uses Flashback Query
[Diagram labels: Database, Customer, Order]
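The table-level and row-level cases can be sketched as follows. The ORDERS table, its ORDER_ID column, and the timestamps are hypothetical. Flashback Table requires row movement to be enabled on the table.

```sql
-- Table level: rewind the rows of a table to an earlier time (uses undo).
ALTER TABLE orders ENABLE ROW MOVEMENT;
FLASHBACK TABLE orders TO TIMESTAMP
  TO_TIMESTAMP('2003-09-09 14:00:00', 'YYYY-MM-DD HH24:MI:SS');

-- Table level: bring a dropped table back from the recycle bin.
FLASHBACK TABLE orders TO BEFORE DROP;

-- Row level: re-insert one wrongly deleted row, read via Flashback Query.
INSERT INTO orders
SELECT *
FROM   orders AS OF TIMESTAMP
         TO_TIMESTAMP('2003-09-09 14:00:00', 'YYYY-MM-DD HH24:MI:SS')
WHERE  order_id = 1001;
COMMIT;
```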
16. Flashback for All Users
- END USER
- Flashback Query
- Flashback Versions Query
- DEVELOPER
- Flashback Versions Query
- Flashback Transaction Query
- Flashback Table
- DATABASE ADMIN
- Flashback Database
- Flashback Drop
17. Revolution in Recovery
- Flashback Revolutionizes Recovery
- Operates on just the changed data
- Time to correct error equals time to make error
- Minutes instead of hours
- Flashback is Easy
- Single command instead of complex procedure
Correction Time = Error Time, not f(DB_SIZE)
18. Prevention and Recovery of Corruptions
[Diagram: downtime taxonomy repeated from slide 9]
19. Oracle End-to-end Data Validation
- H.A.R.D.: Hardware Assisted Resilient Data
- Prevents corruption introduced in I/O path between DB and storage
- Initially introduced in Oracle9iR2
- 10g HARD provides
- Better checks
- All file types and block sizes checked
- DB, log, archive, backup, etc.
- A.S.M. enables HARD without using RAW devices
- Supported by major storage vendors
[Diagram: I/O path from Oracle through A.S.M., Volume Manager, Operating System, Host Bus Adapter, SAN Virtualization, and SAN Interface to the Storage Device. Oracle validates blocks and adds protection info to each block; the storage device validates the protection info.]
20. Flash Recovery Area
- Fully automatic disk based backup and recovery
- Set and Forget
- Nightly incremental backup rolls forward recovery area backup
- Changed blocks are tracked in production DB
- Full scan is never needed
- Dramatically faster (20x)
- Blocks validated to prevent corruption of backup copy
- Use low cost ATA disk array for recovery area
Two Independent Disk Systems
[Diagram: Database Area and Flash Recovery Area. Nightly: apply validated incremental. Weekly: archive to tape.]
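The database-side setup for this can be sketched in a few statements. The size, directory, and tracking file path are hypothetical values. The nightly incremental itself is run from RMAN and is not shown here.

```sql
-- Define the recovery area; the size limit must be set before the location.
ALTER SYSTEM SET db_recovery_file_dest_size = 200G SCOPE=BOTH;
ALTER SYSTEM SET db_recovery_file_dest = '/u02/flash_recovery_area' SCOPE=BOTH;

-- Track changed blocks so incremental backups do not scan every datafile.
ALTER DATABASE ENABLE BLOCK CHANGE TRACKING
  USING FILE '/u02/oradata/orcl/change_tracking.f';

-- Confirm that tracking is on and where the tracking file lives.
SELECT status, filename
FROM   v$block_change_tracking;
```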
21. Low Cost, No Compromise Disaster Recovery
[Diagram: downtime taxonomy repeated from slide 9]
22. Existing Site Recovery Tradeoffs
[Diagram: Production Database ships transactions to the Standby Database, which applies them after a 4 hour delay; reporting runs on delayed data]
- User can delay log apply to protect from user errors, but
- Failover takes hours
- Reports run on hours-old data
- After failing over to standby, production DB must be rebuilt
- Production has updates that did not get to standby
23. Low Cost, No Compromise Disaster Recovery
[Diagram: Production Database ships transactions to the Standby Database with Real Time Apply and no delay; Flashback Log appears twice in the diagram; reporting runs on real-time data; some nodes are used for other computing]
- Flashback DB removes need to delay apply of logs to correct errors
- Flashback DB removes the need to reinstantiate primary on failover
- Real-time log apply enables real-time reporting on standby
- Data Guard works transparently across GRID clusters
- Standby can use fewer CPU resources than primary
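Real-time apply is switched on at the standby. As a sketch, assuming standby redo logs are already configured, the statement differs by standby type:

```sql
-- Physical standby: apply redo as it arrives instead of waiting
-- for each log to be archived.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
  USING CURRENT LOGFILE DISCONNECT FROM SESSION;

-- Logical standby (SQL Apply): the equivalent real-time option.
ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;
```

Only one of the two applies to a given standby database.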
24. Highest Data Protection, Lowest Cost
Dramatic Advances in Ease of Use
Data Guard Site Failure Protection
Combine the Features to Achieve Any Level of Data Protection
25. Other Protection Enhancements
- Compression of archive logs and backups
- Automated failover to a previous backup when restore discovers a missing or corrupt backup
- Automated recovery through a previous point in time recovery (recovery through resetlogs)
- Automated creation of new files during recovery
- Automated channel failover on backup or restore
- Automated tablespace point-in-time recovery
- Full DB begin backup command for faster mirror split
- Improved Recovery Parallelism (2 to 4x)
- Tablespace Rename
- Proxy (third-party) Backup for archive logs
- Time window based throttling of backups
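Two of the items above map directly onto single SQL statements. A short sketch, with hypothetical tablespace names:

```sql
-- Full DB begin backup: put every datafile in backup mode with one
-- statement, split the storage mirror, then end backup mode.
ALTER DATABASE BEGIN BACKUP;
-- (the mirror split happens here, outside the database)
ALTER DATABASE END BACKUP;

-- Tablespace Rename: rename in place without moving data.
ALTER TABLESPACE users RENAME TO app_data;
```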
26. Other 10g Data Guard Enhancements
- SQL Apply Enhancements
- Support for LONGs
- Support for multi-byte CLOBs and NCLOBs
- Support for Index Organized Tables without overflow or LOB segments
- Instantiation of logical standby with no quiesce of primary
- Generic Data Guard Enhancements
- Data Guard Broker support for RAC
27. No Cost System Changes
Goal
- Allow any change to the system with no downtime
- Online Reconfiguration
- Rolling Upgrades
[Diagram: downtime taxonomy repeated from slide 4]
28. No Cost System Changes: Capacity on Demand
- CPU
- Add/remove CPUs on SMP online
- Cluster Nodes
- Add/remove cluster nodes online
- No data movement needed
- Memory
- Grow and shrink shared memory and buffer cache online
- Auto tuning of memory online
- Disk
- Add/remove disks online
- Automatically rebalance
- Move datafiles
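The memory and disk items can be illustrated with a few online statements. The sizes, disk group name, disk name, and device path are hypothetical. The memory statement runs in the database instance and the disk group statements run in the ASM instance.

```sql
-- Memory: resize the automatically tuned SGA online
-- (the new value must not exceed sga_max_size).
ALTER SYSTEM SET sga_target = 2G SCOPE=MEMORY;

-- Disk: add a disk online; ASM rebalances existing data onto it.
ALTER DISKGROUP data ADD DISK '/dev/raw/raw5' NAME data_0005
  REBALANCE POWER 4;

-- Disk: remove a disk online; its data is moved off before it is released.
ALTER DISKGROUP data DROP DISK data_0001;

-- Watch rebalance progress.
SELECT operation, state, power, est_minutes
FROM   v$asm_operation;
```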
29. Rolling Patch Upgrade using RAC
Applies to: Oracle Patch Upgrades, Operating System Upgrades, Hardware Upgrades
- Step 1: Initial RAC Configuration (nodes A and B)
- Step 2: Clients on A, Patch B
- Step 3: Clients on B, Patch A
- Step 4: Upgrade Complete
30. Rolling Release Upgrade using Data Guard
Applies to: Patch Set Upgrades, Major Release Upgrades, Cluster Software and Hardware Upgrades
- Step 1: Initial SQL Apply Config (both on Version X, logs ship)
- Step 2: Upgrade node B to X+1 (logs queue)
- Step 3: Run mixed to test (logs ship)
- Step 4: Switch to B, upgrade A (logs ship)
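The four steps correspond roughly to the SQL Apply statements below. This is a simplified sketch, not the full documented procedure, which has additional preparation and verification steps. A and B are the two databases from the diagram.

```sql
-- Step 2, on logical standby B: stop applying so logs queue up,
-- then upgrade B's software outside the database.
ALTER DATABASE STOP LOGICAL STANDBY APPLY;

-- Step 3, on B after the upgrade: resume apply and catch up
-- while A still runs the old version.
ALTER DATABASE START LOGICAL STANDBY APPLY;

-- Step 4, role switch: run the first statement on A (current primary)
-- and the second on B (the upgraded standby).
ALTER DATABASE COMMIT TO SWITCHOVER TO LOGICAL STANDBY;
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
```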
31. No Cost Data Changes
Goal
- Competitive pressures demand continual change
- Need to change data with no interruption to the application
- Location, format, indexing, or even definition
- Online Redefinition: Evolution without Interruption
[Diagram: downtime taxonomy repeated from slide 4]
32. Online Redefinition
- All indexing operations can be done online
- Create new index, move index, defragment index
- Tables can be reorganized and redefined online
- Table contents are copied to a new table
- Defragments and allows changing location, table type, partitioning
- Contents can be transformed as they are copied
- Can change columns, types, sizes (specified using SQL Select)
GUI interface to make it Simple
[Diagram: the Source Table is copied, with a transform, into the Result Table while queries and updates continue; update tracking stores the updates, which are then transformed as well]
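The copy-and-transform flow can be sketched with the DBMS_REDEFINITION package. The schema APP, the table ORDERS with columns ORDER_ID (primary key), STATUS, and AMOUNT, and the interim table name are all hypothetical. The interim table is created first in the desired new shape, here hash partitioned.

```sql
-- Interim table in the target layout (hypothetical columns).
CREATE TABLE app.orders_interim (
  order_id NUMBER PRIMARY KEY,
  status   VARCHAR2(20),
  amount   NUMBER(12,2)
)
PARTITION BY HASH (order_id) PARTITIONS 4;

BEGIN
  -- Raises an error if the table cannot be redefined online.
  DBMS_REDEFINITION.CAN_REDEF_TABLE('APP', 'ORDERS',
                                    DBMS_REDEFINITION.CONS_USE_PK);

  -- Start the copy; col_mapping is the select list that transforms rows.
  DBMS_REDEFINITION.START_REDEF_TABLE(
    uname        => 'APP',
    orig_table   => 'ORDERS',
    int_table    => 'ORDERS_INTERIM',
    col_mapping  => 'order_id order_id, UPPER(status) status, amount amount',
    options_flag => DBMS_REDEFINITION.CONS_USE_PK);

  -- Apply updates tracked on the source while the copy was running.
  DBMS_REDEFINITION.SYNC_INTERIM_TABLE('APP', 'ORDERS', 'ORDERS_INTERIM');

  -- Swap the two tables; ORDERS now has the new layout.
  DBMS_REDEFINITION.FINISH_REDEF_TABLE('APP', 'ORDERS', 'ORDERS_INTERIM');
END;
/
```

Index work follows the same idea with the ONLINE keyword, for example ALTER INDEX ... REBUILD ONLINE.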
33. Online Redefinition Enhancements
- Enhanced Online Table Redefinition
- Easy cloning of indexes, grants, constraints, etc.
- Convert from LONG to LOB online
- Allow unique index instead of primary key
- Change tables without recompiling stored procedures
- Stored procedures can depend on the signature of a table instead of the table itself
- Online Segment Shrink
- Return unused space within the blocks of a segment to the tablespace
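Online Segment Shrink is a single statement once row movement is enabled. A brief sketch on a hypothetical ORDERS table, assuming its tablespace uses automatic segment space management:

```sql
-- Shrink moves rows, so row movement must be enabled first.
ALTER TABLE orders ENABLE ROW MOVEMENT;

-- Optional first pass: compact rows but leave the high water mark alone.
ALTER TABLE orders SHRINK SPACE COMPACT;

-- Finish: lower the high water mark and return the space to the tablespace.
-- CASCADE also shrinks dependent segments such as indexes.
ALTER TABLE orders SHRINK SPACE CASCADE;
```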
34. Maximum Availability Architecture (MAA)
- Operational practices are key
- Technology alone is not enough
- MAA is a blueprint for achieving HA and DR
- Tested, validated, and documented best practices
- Database, Storage, Cluster, Network
- 10 person-year effort
- otn.oracle.com/deploy/availability
M.A.A.: How to Prevent, Tolerate, and Recover From Outages
Maximum Availability, Unbreakable Architecture, Best Practices
35. Highest Availability at Lowest Cost
- Highest Availability
- Fault Tolerant Clusters
- Flashback Error Correction
- Automated Disk Backup
- No Compromise Disaster Recovery
- Rolling Upgrades
- Online Redefinition
- At Lowest Cost
- Low Cost Grid servers
- Low Cost Modular Storage Arrays
- Automated and Simple to Use
Oracle10g is Unbreakable and Inexpensive
36. Next Steps: High Availability Sessions from Oracle
Tuesday in Moscone Room 304
- 11:00 AM: How Oracle Database 10g Revolutionizes Availability and Enables the Grid
- 3:30 PM: Oracle Recovery Manager (RMAN) 10g Reloaded
- 5:00 PM: Proven Techniques for Maximizing Availability
Wednesday in Moscone Room 304
- 8:30 AM: Oracle Database 10g, RMAN and ATA Storage in Action
- 11:00 AM: Oracle Data Guard, Maximum Data Protection at Minimum Cost
- 1:00 PM: Oracle Database 10g Time Navigation, Human-Error Correction
- 4:30 PM: Data Guard SQL Apply, Back to the Future
For More Info On Oracle HA Go To
http://otn.oracle.com/deploy/availability/
37. Next Steps: High Availability Sessions from Oracle
Thursday
- 8:30 AM in Moscone Room 304: Oracle Database 10g Data Warehouse Backup and Recovery: Automatic, Simple, Reliable
- 8:30 AM in Moscone Room 104: Building RAC Clusters over InfiniBand
For More Info On Oracle HA Go To
http://otn.oracle.com/deploy/availability/
38. Q & A
Questions and Answers
39. New Oracle Database 10g HA Features
- Clusters
- Portable Clusterware
- Cluster file system for Linux and Windows
- Automated Patching
- Data Guard SQL Apply
- Support for Longs
- Support for multi-byte CLOBs and NCLOBs
- Support for Index Organized Tables
- Simplified zero data loss failover
- Real time apply allows real time reporting
- Zero downtime instantiation
- Rolling Upgrades
- Rolling Upgrades Using Data Guard SQL Apply
- Online Redefinition
- Support of Unique Indexes
- One Step Cloning of Dependent Objects
- Columns can be Populated Using Sequences and Sysdate
- Signature Based Dependency Tracking Using Synonyms
- Online Segment Shrink
- Data Guard Generic
- Data Guard Broker support for RAC
- Named Data Guard Configurations
- Real Time Apply
- Flashback Standby Database
- Flashback Reinstantiation
- Improved Recovery Parallelism
40. New Oracle Database 10g HA Features
- Flash Backup and Recovery
- Automated Management of Backup and Recovery Disk Space
- Simplified Backup Using Image Copy
- Change Aware Incremental Backups
- Incrementally Updated Backups
- Compressed archive logs
- Tuning
- Improved Recovery Parallelism
- Faster Instance Startup Cache Warm
- Flashback
- Flashback Drop
- Flashback Row History
- Flashback Table
- Flashback Transaction History
- Flashback Database
- Better map of time to SCN for flashback query
- LogMiner
- Automated Specification of Logs to Mine
- Support for Shared Server Configurations
- Fine Grained Supplemental Logging
- Backup and Recovery
- Simplified Recovery Through Resetlogs
- Restore Tolerates Missing Backups
- Proxy Backup of Archives
- Automated TSPITR Instantiation
- Full DB Begin Backup
- Automated Backup Channel Failover
- Simplified RMAN cataloging of backup files
- Automated File Creation during Recovery
- Drop Database
- Rename Tablespace