Title: Jeff Slavitz Independent Consultant Oracle Applications DBA
1Using Oracle Data Guard for Applications Disaster
Recovery and More
Jeff Slavitz Independent Consultant Oracle
Applications DBA and Developer Jeff_at_OracleAppsPro
.Com
2What Would You Do???
- Its Monday morning.
- Fatal production system hardware problem, or
production hardware location is now a smoking
hole - Expect Production to be down for an unforseen
amount of time - It is year-end, you have a new manager and your
senior DBA is on vacation
3- What is your backup situation?
- Do you have a cold backup? On-site or off-site?
- Do you have a hot backup. On disk? On tape? On
another accessible site? How recent is it? - Will you be able to read your tape backups?
- How recent are the archive logs that are now on
the system that is down? - Where are your archive log backups since the last
hot backup?
4Options
- Restore and clone latest backup to a test box.
How familiar is your DBA with RMAN recover? Is
the box big enough to run the company? Will
interfaces work (EDI, Payroll, ) - Rent or buy a box. How long would it take to
find and setup a new box? - How long will it take to get the company up and
running? Have you practiced this process or is
this a fire drill? How many P1s will you need? - How much data will the company lose?
- What does the company do while Oracle is down?
5How Do You Feel?
How Does Your Manager Feel?
6Or
- In 15 minutes youve fully switched Oracle
Applications over to a standby database - Youve tested the procedure with all DBAs
- The failover procedure is fully supported and
documented by Oracle - Interfaces are tested on the standby database
- You run the entire company once a month for a
short period of time on the standby database - You have a test standby database that DBAs
practice on and you use for reporting purposes
7How Do You Feel?
How Does Your Manager Feel?
8My Goal Today
- Tell you how you can be relaxed, not stressed, in
a disaster recovery situation - What is Data Guard
- Describe added value Data Guard and Flashback
provide in addition to disaster recovery - On a high level, review steps to implement Data
Guard and Flashback - Lessons learned - what Metalink notes dont tell
you - Please ask questions as I go along
- Apologies to the non-DBAs
9What is Data Guard?
- Data Guard is part of Oracle, not a separate
product, which provides a set of services that
create, maintain, manage, and monitor one or more
standby databases - Data Guard maintains these standby databases as
transactionally consistent copies of the
production database - Data Guard can switch any standby database to the
production role
10Without Data Guard
11With Data Guard
Oracle Database High Availability Best Practices
10g Release 2 (10.2)
12Why Not Use Data Guard?
- Too expensive not! You already have and
license it if you have Oracle Enterprise Edition.
Nothing to purchase in order to implement what we
discuss today. - Too bleeding edge. Data Guard has been around
since Oracle 9i and Oracle 8i as Standby Server. - Too difficult to implement. Well documented by
Oracle and completely supported. This
presentation is a roadmap through the process
plus my own observations and experiences. - No DR box. Use test box for standby. Doesnt
protect against smoking hole but its better than
nothing. - How often do you really need DR? Is once enough?
Plus a standby database can be used for other
purposes.
13Why You Should Use Data Guard
- If your Production Oracle Applications instance
went away would anybody notice? - Our job is to protect the companys data
- Your manager will thank you DR is important.
- Data Guard plus Flashback provide added value
- Instant clone
- Reporting instance
- Many exciting possibilities!
14What Is Flashback?
- Introduced in Oracle 9i
- Flashback is part of your Oracle database no
licensing requirement - Provides a rewind button for your data
- Flashback Database - return database to a
previous point in time - Flashback Table - return a table to a previous
point in time - Flashback Query see data at a previous point in
time
15Data Guard and Flashback Together
- Many possibilities when used together
- Scenario
- Show stopper problem in Production
- Need to test a patch or data fix with current
production data on test system ASAP - Old school Clone
- Time to complete ??? Hours
- Data Guard Use standby database as an instant
clone - Time to complete 10 minutes
16Instant Clone
- Activate standby database
- Perform testing on standby database
- When testing complete, flashback standby database
to activation point in time - Test again? Allows destructive testing.
- When all done, flashback and re-activate standby
database
17Reporting Database
- Scenario
- You want a data warehouse which is a snapshot of
Production as of 9pm the day before. Users need
access via Oracle Applications and Discoverer - Old School Daily clone
- Data Guard Use standby database as a reporting
database
18Reporting Database
- Activate standby database
- Users login to standby database with Oracle
Applications, Discoverer, or any other tool.
Users can make changes in Application though they
wont be saved after Flashback. - Archive logs accumulate on standby database
- At 9pm each day flashback database to the way it
was at 9pm the day before, apply archive logs
created since then, reopen database - Can have multiple standby databases one for DR,
one for reporting, one for ??? - Oracle 11i Active Data Guard allows read-only
access to standby database while it is being
updated from primary
19Bumps Along the Way
- Time implementation so its not at year-end
- Bumpy transition into production low Test
database activity is not an accurate picture of
how Data Guard and Flashback will work in
Production - 3mb network was not enough for standby to keep up
with primary - needed 10mb just for Data Guard - Do periodic full company test of standby system
- Data Guard and Flashback implementation are well
documented but there is a learning curve
20Implementing Data Guard and Flashback
21Hardware Configuration
Primary (California)
Standby (Nevada)
Database
Database
45mb of which 8-10mb used by Data Guard
Application
Application
22Software Configuration
- Database
- 10.2.0.4
- Data Guard Physical Bundle patch 7936993
- Data Guard Logical Bundle patch 7937113
- Data Guard Broker Bundle patch 7936793
- Applications 11.5.10.2
23Data Guard Setup
- Two Metalink notes
- Note 216212.1 Business Continuity for Oracle
Applications Release 11i, Database Releases 9i
and 10g - Note 452056.1 Business Continuity for Oracle
Applications Release 12 on Database Release 10gR2
(earned my vote for best Metalink note ever) - Use Release 12 note
- Better configuration, simpler and works fine with
11i - Substitute APPL_TOP for Instance Top in text
24Logical vs Physical
- Two types of standby databases
- Physical
- Byte-for-byte duplicate of the primary database
- Archived redo logs transferred from primary
database are directly applied to the standby
database - When standby in recovery mode it is mounted not
open - Logical
- Different structure than the primary database
- Oracle uses SQL statements to update standby
- Standby can be open while it is being updated
- Today we are talking about Physical standby
databases
25Maximum Protection Mode
- Three types of protection Maximum Protection is
highest level of data security - Primary database changes are not committed until
it has been confirmed that the data is available
on at least one standby database - If Oracle determines that the redo data cannot be
transferred from the primary server to the
standby server, it will automatically stop the
primary database instance - Data transmitted synchronously - network
implications
26Maximum Performance Mode
- Default protection mode
- The commit operation on the primary database is
not contingent upon the data being received by
the standby server - If all of the standby servers become unavailable,
processing will continue on the primary database - Data transmitted asynchronously
- This performance mode is what are discussing today
27Maximum Availability Mode
- Second highest level of data security
- Primary database changes are not committed until
it has been confirmed that the data is available
on at least one standby database - If the standby database becomes unavailable for
any reason, the protection mode is temporarily
lowered to maximum performance until the problem
has been corrected
28Data Guard Installation Overview
- Develop a naming convention for your primary and
standby database servers e.g. PROD_ltprimary
servergt, PROD_ltstandby servergt - Create standby redo logs one more than number
of redo logs, same size as redo logs, do not
multiplex - Clone database to standby
- Copy datafiles to standby with closed database or
with open database using RMAN (see note 753241.1,
Configuring Standby Database on R12 using RMAN
Hot Backup) - Test primary and standby and listeners using
tnsping - Clone application to standby
- You are now ready to turn on Data Guard
29Primary init.ora ifile Parameters
- Global unique name
- db_unique_namePROD_CAMELDB1
- Flash recovery area. This is the default
location for control files, online - redo logs, archived redo logs, flashback logs,
RMAN backups. - db_recovery_file_dest/u01/oracle/flash_recovery_a
rea - First destination for archived redo logs
- log_archive_dest_1
- 'LOCATIONUSE_DB_RECOVERY_FILE_DEST
MANDATORY - Second destination for archived redo logs
- log_archive_dest_2 'SERVICEPROD_LNXDRDB1
- valid_for(online_logfiles,primary_role)
- db_unique_namePROD_LNXDRDB1 LGWR
ASYNC20480 - OPTIONAL REOPEN15 NET_TIMEOUT30
30Primary init.ora ifile Parameters
- Limit on the total space available to the flash
recovery area. - db_recovery_file_dest_size100g
- In minutes, how long you want to keep flashback
files. Since using - guaranteed flashback really don't need this to
be too big. - db_flashback_retention_target120
- On/Off switch for sending logs to standby
system - log_archive_dest_state_2 defer
- Databases in Data Guard configuration using my
instance naming convention - log_archive_config'dg_config(PROD_CAMELDB1,PROD_
LNXDRDB1) - Log gap detection and resolution when this
database is the standby - fal_server 'PROD_LNXDRDB1
- fal_client 'PROD_CAMELDB1'
31Start Data Guard
- Start primary and second database and listeners
- Primary startup
- Standby startup mount
- Turn on archive redo log transport on primary
- alter system set log_archive_dest_state_2enable
- Put standby in recovery mode
- Recover managed standby database using current
logfile disconnect - Confirm logs are shipping to standby
- Primary alter system archive log current
- Standby Check archive directory or query
varchive_log
32Care and Feeding of Data Guard
- Check alert log of primary and standby on a
regular basis - Set Enterprise Manager to email you when an error
shows up in alert log - Automatic archive log gap resolution usually
works but network connectivity issue can require
you to manually fix - Primary alert log will notify you when a gap
exists - FALclient Failed to request gap sequence
- GAP - thread 1 sequence 435-442
- Copy missing archive logs from primary to standby
- On standby alter database register logfile
33Flashback
- db_recovery_file_dest
- defines Flash recovery area disk location
- This is the default location for control files,
online redo logs, archived redo logs, flashback
logs, RMAN backups - Archive logs are placed in flashback area in
sub-directory with date - DO NOT delete files in flashback area manually
database calculates space available based on what
RMAN has done - db_recovery_file_dest_size
- Specifies limit on the total space to be used by
the flash recovery area - Check space usage using vflash_recovery_area_usag
e - When recovery area is full database stops!
- db_flashback_retention_target
- Defines in minutes, how long you want to keep
flashback files - Since using guaranteed flashback don't need this
to be too big - To turn flashback on mount database and alter
database flashback on
34Switchover Database Steps
- Seamless switch from primary to standby
- Switch existing primary to new standby
- Shutdown application
- Confirm all logs received and applied on standby
- alter database commit to switchover to standby
with session shutdown - Recover managed standby database using current
logfile disconnect - Switch existing standby to new primary
- alter database commit to switchover to primary
- alter database open
- alter system set log_archive_dest_state_2enable
- Confirm logs are shipping to standby by checking
alert log, archive directory or query
varchive_log
35Switchover Application Steps
- Some of these steps are not in Metalink note but
are required - Clear application context exec
fnd_conc_clone.setup_clean - Run adautocfg on database and application tiers
- Run cmclean.sql (note 134007.1) to avoid Output
Post Processing manager not starting - Update fnd_concurrent_requests
- Change logfile_name and outfile_name to new
APPLCSF path - Change logfile_node_name and outfile_node_name to
new host name - Update fnd_conc_req_outputs (for XML reports)
- Change file_name to new APPLCSF patch
- Change file_node_name to new host name
36Switchover Application Steps
- Update wf_notification_attributes.text_value with
new node name - Update wf_item_attribute_values.text value with
new node name - Check for any other profile values that contain
old node name - Clear Apache cache
- Start application and send users new login url
- Synchronize APPLCSF log and out files
- Use rsync
- Faster than scp and only copies changed files
- rsync -avz
- APPLCSF/out/TWO_TASK/ - applprod_at_lnxdrapp1
APPLCSF/out/TWO_TASK - Consider setting cron job to synchornize APPLCSF
in case of system failure
37Failover
- Immediate switch of primary to standby system
- Some archive log data may be lost due to network
latency - On standby-soon-to-be-primary
- Cancel database recovery
- alter database commit to switchover to primary
- Configure application as shown in Switchover
- Even with cron job synchronizing APPLCSF log and
out files you probably wont get all report
output - Repair former primary system
38Failover Sync Old Primary as New Standby
- Establish new standby by copying primary
datafiles using RMAN, cold backup or flashback.
To use flashback - Find the SCN when the existing primary database
became the primary - select to_char(standby_became_primary_scn) from
vdatabase - On new standby
- Startup mount
- flashback database to scn ltabove gt
- alter database convert to physical standby
- recover managed standby database using current
logfile disconnect
39Instant Clone or Reporting Database
- On standby
- Cancel standby recovery
- create restore point STBY_ON guarantee flashback
database - Active and open database
- Configure application as in Switchover
- Users can now login to the standby database
through Oracle Applications, Discoverer or
anything. - Apply patches, test, update data in standby
database. - To revert to restore point STBY_ON and do further
testing as needed - Shutdown immediate
- Startup mount
- Flashback database to guaranteed restore point
STBY_ON - Alter database open
40Resume Standby Operation
- On Standby
- Shutdown application
- Shutdown database
- Startup mount
- Flashback database to restore point STBY_ON
- Alter database convert to physical standby
41Lessons Learned
- Script as much as possible you will make typos
under time pressure - Use Enterprise Manager to see how Data Guard
affects your system and tune accordingly - Implement Data Guard and flashback at different
times - Setup one test database using Data Guard for
testing and training
42For the Future
- Try Maximum Availability protection
- Use Enterprise Manager to manage Data Guard.
This requires Data Guard Broker implementation
which requires use of spfile. - Completely script switchover and failover as
shown in note 452056.1 - Use RMAN to backup standby database only
- Upgrade database to 11G and experiment with
Active Data Guard. This allows you to create a
read-only database that is open and is constantly
updated from primary.
43For More Information
- Google (it knows everything)
- http//blogs.oracle.com/stevenchan
- Note 452056.1 Business Continuity for Oracle
Apps Release 12 on Database 10GR2 (works fine for
11i) - Note 216212.1 Business Continuity for Oracle
Apps Release 11i, Database Releases 9i and 10G - Note 753241.1 Configuring Standby Database on
R12 using RMAN Hot Backup (works fine for 11i) - Note 805438.1 How to Open Physical Standby For
Read Write Testing and Flashback - Data Guard Redo Transport Network Best
Practices Oracle 10GR2 White Paper - Oracle Data Guard Concepts and Administration
manual - High Availability Best Practices manual
- Backup and Recovery Reference manual
44- Questions?
- Want the latest version of this presentation?
- www.OracleAppsPro.com
- Jeff Slavitz
- Jeff_at_OracleAppsPro.com
- (415) 388 - 3003