Title: Oracle Data Guard
1Oracle Data Guard
2Causes of Data Loss
Hardware system errors
Source Disaster Recovery Journal
3Real Application ClustersContinuous Availability
for all Applications
System Failures
Data GuardGuaranteed Zero Data Loss
SiteFailures
Unplanned Downtime
FlashbackGuaranteed Zero Data Loss
HumanError
ASM MirroringStorage Failure Protection
Storage/NetFailures
Dynamic ReconfigurationCapacity on Demand
without Interruption
Online RedefinitionAdapt to Change Online
4What is a Standby Database ?
- A copy of a production database that you can use
for disaster protection. You can update the
standby database with redo logs from the
production database in order to keep it current.
If a disaster destroys the production database,
you can activate the standby database and make it
the new production database. - You can maintain the standby data in one of the
following modes - For physical standby databases
- Redo Apply
- For logical standby databases
- SQL Apply
- A Standby Database is NOT Data Guard
5Why Data Guard?
- Data Guard helps you protect your Data.
- Takes your data and automatically puts it
elsewhere - Makes it available for Failover in case of
failure. - The apply process also revalidates the log
records to prevent application of any log
corruptions - Geographically dispersed sites
- Useful for logical data corruptions if lag behind
used - Flexible configuration options for protection
level - Reporting and backups can be diverted to standby
- Automatic resync for failed primary
- Switchover for Maintenance
6 Traditional Physical Standby
DatabasesInvestment in Disaster Recovery
7Active Data Guard 11g Investment in Improved
Quality of Service
8Requirements
- Data Guard 11g has several options for deploying
different CPU architectures, O.S. binaries and
Oracle database binaries, on primary and standby
systems. - For example, the primary database may be on
Windows, and the standby database may be on
Linux. - See MetaLink Note 413484.1 for latest
capabilities and restrictions
9Bandwidth Requirements
- Depends on Redo generation
- Find peak redo in AWR report
- Load Profile Per Second Per Transaction
- ---------------
--------------- - Redo size 51,944.64 5,177.09
- Bandwidth in MBPS
- (redo bytes per sec /0.7)8)/1,000,000
10Physical Standby
- Protection Modes
- Physical Standby Architecture
- Standby Redo Logs
- Real Time Apply
- Automatic Resynchronization
11Database Protection Modes
- Maximum Protection
- No Data Loss and No data divergence
- Arch_dest mandatory, lgwr, sync, affirm
- Primary db shutdown when unable to access stdby
- Maximum Availability
- Arch_dest mandatory, lgwr, sync, affirm
- Protection auto lowered when stdby is unavailable
- Maximum Performance
- Arch_dest lgwr/arch, sync/async,
mandatory/optional - Minimal performance impact
12Maximum Availability Mode
Protection Mode
Failure Protection
Redo Shipping
Maximum Availability Zero Data Loss
Protects Against Primary Failure
LGWR using SYNC
- Zero Data Loss as long as the network stays up!
- Enforces protection of every transaction
- Configuration LGWR SYNC
- If last standby is unavailable, processing
continues at primary - When the standby becomes available again,
synchronization with the primary is automatic
ALTER DATABASE SET STANDBY TO MAXIMIZE
AVAILABILITY
13Architecture
Primary database transactions
Standby database
MRP or LSP
(MRP only)
LGWR
RFS
Online redo logs
Oracle net
Standby redo logs
Backup
Reports
FAL
ARC0
ARC0
Archived redo logs
Archived redo logs
14Standby Redo Logs
15Real Time Apply
- Redo data is applied to the standby database as
soon as it is received from the primary database - In Oracle9i Data Guard this apply has to wait
till an archivelog is created on the standby
database - For Redo Apply
- ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
USING CURRENT LOGFILE - When real time apply is enabled, RECOVERY_MODE
column in VARCHIVE_DEST_STATUS displays MANAGED
REAL TIME APPLY
16- SQLgt ALTER SYSTEM SET LOG_ARCHIVE_DEST_2'SERVICE
tmstby - 2gt OPTIONAL LGWR SYNC AFFIRM
- 3gt VALID_FOR(ONLINE_LOGFILES,PRIMARY_ROLE)
- 4gt DB_UNIQUE_NAMEtmstby'
- SQLgt ALTER SYSTEM SET LOG_ARCHIVE_CONFIG'DG_CONFI
G(tmtst,tmstby)' - SQLgt ALTER DATABASE SET STANDBY DATABASE TO
MAXIMIZE AVAILABILITY - SQLgt SELECT PROTECTION_MODE, PROTECTION_LEVEL
FROM VDATABASE - PROTECTION_MODE PROTECTION_LEVEL
- --------------------- ---------------------
- MAXIMUM AVAILABILITY MAXIMUM AVAILABILITY
17Real-Time Apply Architecture
18Real Time ApplyBenefits
- Standby databases now more closely synchronized
with the primary - More up-to-date, real-time reporting
- Faster switchover and failover times
- Reduces planned and unplanned downtime
- Better Recovery Time Objective (RTO) for DR
19Real Time Apply -Tuning Media Recovery
- Monitor via
- Data Dictionary, OEM, Standby Statspack in 11G
- Big performance boost in Oracle 11G
- Up to 100 increase in redo apply performance
- New standby statspack in Oracle 11G
- See MetaLinkNote 454848.1
- Includes information specific to a standby
- Output from VRECOVERY_PROGRESS
- Output from VMANAGED_STANDBY
20Data Dictionary
- VDATABASE
- DATABASE_ROLE LOGICAL STANDBY, PHYSICAL STANDBY
or PRIMARY - PROTECTION_LEVEL current protection mode
setting. - FS_FAILOVER_STATUS synchronization status
- VDATAGUARD_STATS
- VDATAGUARD_STATUS
- VLOG VSTANDBY_LOG Redo log changed.
- VMANAGED_STANDBY Recovery progress
21Determining Query Latency
- From Primary (requires database link)
- select scn_to_timestamp(
- (select current_scn from vdatabase)
- )-scn_to_timestamp(
- (select current_scn from vdatabase_at_dg)
- ) from dual
- If you do not wish to connect to the Primary
-determine the value for APPLY LAG for a best
estimate - Use Enterprise Manager monitoring
- Query VDATAGUARD_STATS
- select value,unit,time_computed from
vdataguard_stats where name'apply lag'
22Automatic Resynchronization
- Network connectivity problems may occur
- Data Guard automatically resynchronizes standbys
after network connectivity restored - Implicit
- ARCH process idling away on the primary pings
all standbys on a regular basis to see if they
are missing any redo data - If so it sends them the missing redo data
- Explicit
- Gap discovered during apply process in physical
standby - Based on FAL_SERVER and FAL_CLIENT settings,
primary notified, and it sends missing redo data
23Data Guard Role Transitions
- Switchover
- Planned role reversal
- Used for OS or hardware maintenance
- Failover
- Unplanned role reversal
- Use in emergency
- Zero or minimal data loss depending on choice of
data protection mode - Different steps for Physical and Logical Standby
- Switchover using Enterprise Manager is literally
two mouse clicks - Well do a Physical Standby Failover via the
command line using the Broker
24Data Guard Broker
- Fast-Start Failover
- Fast-Start Failover Demo
- Client Failover
- Oracle 11G Active Data Guard
25Fast-Start failover
- Makes Data Guard more than a Standby Database.
Enables automatic failover with no data loss. - A feature of Oracle Database Enterprise Edition.
- Only supports up to Maximum Availability Mode.
- Requires 3rd server. Install DGMGRL client part
of Oracle client administrator software. Observer
process continuously monitors primary and standby
databases. - If the listener is not running on port 1521,
local_listener must be set in the spfile. - Observer detects failure.
- Observer automatically executes database failover
once threshold has been exceeded. - DB_ROLE_CHANGE trigger fires enables primary
service. This trigger can be customized to
restart JDBC mid-tier clients and calls any other
OCI enabled application.
26Fast-Start Failover
- 1. Data Guard in steady state transmitting redo
- 2. Observer monitoring state of the configuration
27Fast-Start Failover
- 3. Disaster strikes the primary connections lost
28Fast-Start Failover
- 4. Observer times out
- 5. Observer validates connection with target
standby - 6. Observer begins Fast-Start Failover
29Fast-Start Failover
- 7. Target standby automatically becomes new
primary (DB_ROLE_CHANGE trigger fires)
30Fast-Start Failover
- 8. After old primary is repaired, Observer
re-establishes connection - 9. Observer automatically reinstates old primary
to be a new standby - 10. Redo transmission starts from new primary to
new standby
31Events that trigger Fast-Start Failover
- Database conditions
- Server crash or shutdown (without db shutdown)
- Database instance failure (or last instance
failure in a RAC configuration) - Shutdown abort (or shutdown abort of the last
instance in a RAC configuration) - Datafiles taken offline due to I/O errors
- Network conditions
- When both the Observer and the standby database
lose their network connection to the primary
database, and when the standby database confirms
that it is in a synchronized state.
32Fast-Start Failover Conclusion
- Fast
- Site failover time measured in seconds, not
minutes - Failover is automatic, no manual intervention
- Reliable
- Eliminates human error
- Zero data loss failover
- Simple
- Automatically determines if failover criteria is
met - Original primary database is automatically
reinstated as a new standby database following
failover
33Fast-Start Failover Conclusion
- Prevention of "Split Brain" due to accidental
startup of former primary database - Reduced downtime through automatic activation of
the standby database - A failover solution without a shared disk system
- with additional advantages (enhanced data
availibity) - and even reduced failover time compared to HA
cluster - Many technical prerequisites (Flashback database,
special Maximum Availability Mode) - No automatic failover to a second standby
database possible
34Fast-Start failover
- Requirements
- Fast-Start Failover is a feature of Oracle Data
Guard, and can't run without a Data Guard Broker
configuration! - Observer machine and configuration
- Special entry in Data Guard Broker configuration
- Maximum Availability Mode (mandatory)
- but special startup behaviour
- but primary stalls in certain situations
- Flashback database must be activated
35Demo Switchover
- 1. Configure Broker and Fast_Start Failover
- 2. Configure Observer
- 3. Shutdown abort on the primary database TMTST
- 4. Wait until Fast_Start occurs on TMSTBY
- 5. Restart the old primary TMTST
- 6. Verify that observer reinstates database TMTST
36Demo Configure Fast_Start Failover
- Flash-Recovery areas are setup on both sides
- SQLgt show parameter DB_RECOVERY_FILE_DEST
- NAME TYPE VALUE
- db_recovery_file_dest string
/tst/dump/oracle/fra - db_recovery_file_dest_size big integer 2G
- Setup Flashback Database (on both)
- SQLgt select FLASHBACK_ON from vdatabase
- FLASHBACK_ON
- NO
- SQLgt SHUTDOWN IMMEDIATE
- SQLgt STARTUP MOUNT
- SQLgt ALTER DATABASE FLASHBACK ON
- SQLgt ALTER DATABASE OPEN
- SQLgt show parameter flash
- NAME TYPE VALUE
- db_flashback_retention_target integer 1440
(1 day) - Broker Parameters
37Demo Configure Fast_Start Failover
- Listener.ora on Primary
- SID_LIST_LSNR_DGTEST
- (SID_LIST
- (SID_DESC
- (GLOBAL_DBNAME tmtst.vodacom.co.za)
- (ORACLE_HOME /tst/opt/apps/oracle/database
/10.2.0.4) - (SID_NAME tmtst)
- )
- (SID_DESC
- (GLOBAL_DBNAME tmtst_DGMGRL.vodacom.co.za)
- (ORACLE_HOME /tst/opt/apps/oracle/database
/10.2.0.4) - (SID_NAME tmtst)
- )
- )
38Demo Configure Fast_Start Failover
- Tnsnames on both
- TMTST.VODACOM.CO.ZA
- (DESCRIPTION
- (ADDRESS_LIST
- (ADDRESS (PROTOCOL TCP)(HOST
prab03.vodacom.co.za)(PORT 1521)) - )
- (CONNECT_DATA
- (SERVICE_NAME tmtst_DGMGRL.vodacom.co.za)
- )
- )
- TMSTBY.VODACOM.CO.ZA
- (DESCRIPTION
- (ADDRESS_LIST
- (ADDRESS (PROTOCOL TCP)(HOST
tvbs01.vodacom.co.za)(PORT 1521)) - )
- (CONNECT_DATA
- (SERVICE_NAME tmstby_DGMGRL.vodacom.co.za)
- )
- )
39Demo Configure Observer
- gt dgmgrl
- DGMGRLgt connect sys/xxx_at_tmtst
- Connected.
- DGMGRLgt CREATE CONFIGURATION TMDRTEST AS
- gt PRIMARY DATABASE IS tmtst -gt SHOW PARAMETER
DB_UNIQUE_NAME - gt CONNECT IDENTIFIER IS tmtst -gt tns entry
- Configuration "tmdrtest" created with primary
database "tmtst" - DGMGRLgt ADD DATABASE tmstby AS
- gt CONNECT IDENTIFIER IS tmstby
- gt MAINTAINED AS PHYSICAL
- Database "tmstby" added
- DGMGRLgt ENABLE CONFIGURATION
- DGMGRLgt SHOW CONFIGURATION
- DGMGRLgt SHOW DATABASE VERBOSE tmtst
- DGMGRLgt EDIT DATABASE tmtst SET PROPERTY
'LogXptMode''SYNC' - DGMGRLgt EDIT DATABASE tmtst SET PROPERTY
FastStartFailoverTarget'tmstby' - DGMGRLgt EDIT DATABASE tmstby SET PROPERTY
FastStartFailoverTarget'tmtst' - DGMGRLgt ENABLE FAST_START FAILOVER
- DGMGRLgt START OBSERVER --gt warning, prompt will
not be returned!
40Demo Switchover
- DGMGRLgt SWITCHOVER TO tmstby ------? duration
90 seconds! - Performing switchover NOW, please wait...
- Operation requires shutdown of instance "tmtst"
on database "tmtst" - Shutting down instance "tmtst"...
- ORA-01109 database not open
- Database dismounted.
- ORACLE instance shut down.
- Operation requires shutdown of instance "tmstby"
on database "tmstby" - Shutting down instance "tmstby"...
- ORA-01109 database not open
- Database dismounted.
- ORACLE instance shut down.
- Operation requires startup of instance "tmtst" on
database "tmtst" - Starting instance "tmtst"...
- ORACLE instance started.
- Database mounted.
- Operation requires startup of instance "tmstby"
on database "tmstby" - Starting instance "tmstby"...
- ORACLE instance started.
41Verify on both
- SELECT DATABASE_ROLE,STATUS,DB_UNIQUE_NAME,
PROTECTION_MODE, - PROTECTION_LEVEL, SWITCHOVER_STATUS,
checkpoint_change, current_scn
,STANDBY_BECAME_PRIMARY_SCN, - FS_FAILOVER_STATUS, FS_FAILOVER_CURRENT_TARGET,
FS_FAILOVER_THRESHOLD, FS_FAILOVER_OBSERVER_PRESEN
T, FS_FAILOVER_OBSERVER_HOST - FROM VDATABASE
- /
42Client Failover Best Practices
- SQLgt exec DBMS_SERVICE.CREATE_SERVICE (
- service_name gt 'tmOCI.vodacom.co.za',
- network_name gt 'tmOCI.vodacom.co.za ',
- aq_ha_notifications gt true,
- failover_method gt 'BASIC',
- failover_type gt 'SELECT',
- failover_retries gt 180,
- failover_delay gt 1)
- SQLgt exec DBMS_SERVICE.START_SERVICE('tmOCI.vodaco
m.co.za') - SQLgt select value from vparameter where name
'service_names' - VALUE
- ------------------------------------------------
- tmtst_DGMGRL.vodacom.co.za, tmOCI.vodacom.co.za
43Client Failover Best Practices
- Configure startup trigger for service
- SQLgt CREATE OR REPLACE TRIGGER manage_OCIservice
- after startup on database
- DECLARE
- role VARCHAR(30)
- BEGIN
- SELECT DATABASE_ROLE INTO role FROM
VDATABASE - IF role 'PRIMARY' THEN
- DBMS_SERVICE.START_SERVICE('tmOCI.vodacom.co.z
a') - ELSE
- DBMS_SERVICE.STOP_SERVICE('tmOCI.vodacom.co.za
') - END IF
- END
44Client Failover Best Practices
- Client tns entry Configuration
- TMOCI(DESCRIPTION
- (ADDRESS_LIST
- (ADDRESS (PROTOCOL TCP)
- (HOST prab03.vodacom.co.za)
- (PORT 1521))
- (ADDRESS (PROTOCOL TCP)
- (HOST tvbs01.vodacom.co.za)
- (PORT 1521))
- (LOAD_BALANCE yes))
- (CONNECT_DATA SERVICE_NAMEtmOCI))
- )
45Active Data Guard 11g Investment in Improved
Quality of Service
46Active Data Guard
- Begin with a Data Guard 11g physical standby
database - If redo apply is running, stop redo apply
- Open the standby database read-only
- Start redo apply
47Data Guard Broker Enterprise Manager
- Data Guard Broker CLI
- Stop redo apply with the following command
- EDIT DATABASE TMSTBY' SET STATEAPPLY-OFF
- Open standby read-only via SQLPlus
- SQLgt alter database open read only
- Restart redo apply via broker CLI
- EDIT DATABASE TMSTBY' SET STATEAPPLY-ON
- Oracle Enterprise Manager 10g
- Stop redo apply within Data Guard GUI
- Open standby in read-only mode in Advanced
Startup Options - Restart redo apply within Data Guard GUI
48Supported Operations for Read Only
- When connected to an Active Data Guard standby
database, read-only applications can perform/use - Selects
- Alter session / system
- Set role
- Lock table
- Call stored procedures
- DBlinks to write to remote databases
- Stored procedures to call remote procedures via
DBlinks - SET TRANSACTION READ ONLY for transaction level
read consistency - Complex queries e.g. grouping set queries and
with clause queries
49THANK YOU
- Thinus.Meyer_at_vodacom.co.za
- http//martinmeyer.blogspot.com