Here is my question after tons of search and test without have good solutions.
Target:
1) I have a 12.1.0.2 enterprise single instance database "testdb" as primary database running on server "node1"
2) I have created physical standby database "stbydb" on server "node2"
3) DataGuard running on MaxAvailability mode (SYNC) with 12c default real-time redo apply.
4) Primary database has 3 single-member redo groups. (/oraredo/testdb/redo01.log redo02.log redo03.log)
5) I have created 4 standby redo logfiles (/oraredo/testdb/stby01.log stby02.log stby03.log stby04.log)
6) I do RMAN backup (database and archivelog) on standby site only.
7) I want to use these backup pieces to do full database restore on primary database.
This is a DR test to simulate the scenario that all Primary & Standby servers total lost.
So here is how I do backup, on standby database:
1) Run "alter database recover managed standby database cancel" for ensure consistent datafiles
2) RMAN> backup database;
3) RMAN> backup archivelog all;
I got backup pieces and copied out to primary db server something like:
/home/oracle/backupset/o1_mf_nnndf_TAG20151002T133329_c0xq099p_.bkp (datafiles)
/home/oracle/backupset/o1_mf_ncsnf_TAG20151002T133329_c0xq0sgz_.bkp (spfile & controlfile)
/home/oracle/backupset/o1_mf_annnn_TAG20151002T133357_c0xq15xf_.bkp (archivelogs)
So here is how I do restore, on primary site:
I have cleanup all folders (datafiles, controlfiles redos all gone).
1) restore spfile to pfile
RMAN> startup nomount
RMAN> restore spfile to pfile '/home/oracle/pfile.txt' from '/home/oracle/backupset/o1_mf_ncsnf_TAG20151002T133329_c0xq0sgz_.bkp';
2) modify pfile to convert to primary db content. pfile shows below
*.audit_file_dest='/opt/oracle/db/admin/testdb/adump'
*.audit_trail='db'
*.compatible='12.1.0.2.0'
*.control_files='/oradata/testdb/control01.ctl','/orafra/testdb/control02.ctl'
*.db_block_size=8192
*.db_domain=''
*.db_file_name_convert='/testdb/','/testdb/'
*.db_name='testdb'
*.db_recovery_file_dest='/orafra'
*.db_recovery_file_dest_size=10737418240
*.db_unique_name='testdb'
*.diagnostic_dest='/opt/oracle/db'
*.fal_server='stbydb'
*.log_archive_config='dg_config=(testdb,stbydb)'
*.log_archive_dest_2='service=stbydb SYNC valid_for=(ONLINE_LOGFILE,PRIMARY_ROLE) db_unique_name=stbydb'
*.log_archive_dest_state_2='ENABLE'
*.log_file_name_convert='/testdb/','/testdb/'
*.memory_target=1800m
*.open_cursors=300
*.processes=300
*.remote_login_passwordfile='EXCLUSIVE'
*.standby_file_management='AUTO'
*.undo_tablespace='UNDOTBS1'
3) restart db with modified pfile
SQLPLUS> create spfile from pfile='/home/oracle/pfile.txt'
SQLPLUS> shutdown abort
SQLPLUS> startup nomount
4) restore controlfile
RMAN> restore primary controlfile from '/home/oracle/backupset/o1_mf_ncsnf_TAG20151002T133329_c0xq0sgz_.bkp';
RMAN> alter database mount
5) catalog all backup pieces
RMAN> catalog start with '/home/oracle/backupset/'
6) restore and recover database
RMAN> restore database;
RMAN> recover database until scn XXXXXX; (this SCN is the max in the archivelog backups which is greater than datafile backup's scn)
7) open resetlogs
RMAN> alter database open resetlogs;
Everything looks fine except one of the standby redo logfile is not generated
SQL> select * from v$standby_log;
ERROR:
ORA-00308: cannot open archived log '/oraredo/testdb/stby01.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
no rows selected
I was planning use same backup to restore both primary & standby database to save the traffic and down time in between in the real production world.
So I have done exact the same steps (except RESTORE STANDBY CONTROLFILE and no recover after restore database) to restore standby database.
And I got the same missing logfile.
The problem here is:
1) alert.log full filled with this error, this not the concern here
2) now the REAL-TIME redo apply won't work since the standby side LGWR is always showing "WAITING_FOR_LOG"
3) I can't drop and recreate that logfile
Then I tried more and found:
The missing standby logfile was always "ACTIVE" at the moment RMAN backup was taken.
Eg, on standby db, below group#4 (stby01.log) would be lost after restore.
SQL> select GROUP#,SEQUENCE#,USED,STATUS from v$standby_log;
GROUP# SEQUENCE# USED STATUS
---------- ---------- ---------- ----------
4 19 133632 ACTIVE
5 0 0 UNASSIGNED
6 0 0 UNASSIGNED
7 0 0 UNASSIGNED
So before I took backup, I tried on primary database:
SQL> alter system set log_archive_dest_state_2=defer;
This caused the standby side standby_log group#4 being released:
SQL> select GROUP#,SEQUENCE#,USED,STATUS from v$standby_log;
GROUP# SEQUENCE# USED STATUS
---------- ---------- ---------- ----------
4 0 0 UNASSIGNED
5 0 0 UNASSIGNED
6 0 0 UNASSIGNED
7 0 0 UNASSIGNED
Then, the backup taken after was successfully restored without missing standby logfile.
However, change this parameter from primary database means break down the protection of DataGuard when doing backup. This is not accept on production environment.
Finally, my real questions come:
1) Is there anything I can do that don't need do above parameter changing?
2) I know I can re-create control file to remove standby redo and then re-create after. Is there any simple/faster way to avoid the standby logfile lost or to recreate the lost one?
I understand there are many ways to workaround this. Something like keep a copy of the standby redo log file and copy in place the missing one, etc, etc....
And yes I still able to do none real-time apply "using archived logfile" but that is also not accept in production protection mode.
I just want to proof that design (which is showing in a few oracle doc, Doc ID 602299.1 is one of those) that taking backup from standby database effectively works and can be used to restore both site. And it can be done without spend more time to retake backups or put load on primary database to create standby from active database.
Any of your idea is much appreciated.
Thank you!