Hi,
I need tips or suggestion if there is a way to parallelize validation of a large backupset with multiple of backup pieces. I notice on one of our largest DB. Backup is taking a long to complete, backup it self completes in 5 hours but the restore database validate completes in almost twice the time. The issue not a problem right now and the previous DBA has accepted those performance since the DB is multi TB with several Tablespace more than 100GB, but I am not comfortable and would like to see if it can still be improved.
I have a backupset with more than 30 bakcup piece, when I check status of the validation using v$session_longops and v$backup_async_io, I do not see multiple sessions working on that backup set I only see rman is working on the validation of that backupset serially(per backup piece). The text below is from a test Database on a vm, with smaller tablespaces. I do not want to do some changes on the performance of our current live DB so I did test on my VM on on my desktop. So the output below is not from the Live DB.
This is the details of my backupset for tablespace test_tbs, I forced a filperset=1 and maxpiece size to 100 so I can force test_tbs to create 2 backuppiece(test_tbs is around 180M) I also disabled optimization.
RMAN> show all;
RMAN configuration parameters are:
CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 60 DAYS;
CONFIGURE BACKUP OPTIMIZATION OFF;
CONFIGURE DEFAULT DEVICE TYPE TO DISK;
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/u05/TEST/control/control_TEST_%F';
CONFIGURE DEVICE TYPE DISK BACKUP TYPE TO COMPRESSED BACKUPSET PARALLELISM 6;
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1;
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1;
CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT '/u05/TEST/arc/arc_TEST_%Y%M%D_df_%s_%U' MAXPIECESIZE 100 M;
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u05/TEST/control/control_snapcf_TEST.dbf';
list backup of tablespace TEST_TBS;
BS Key Type LV Size | Device Type Elapsed Time Completion Time |
------- ---- -- ---------- ----------- ------------ ---------------
388 | Incr 0 178.56M | DISK | 00:00:47 | 09-MAR-17 |
List of Datafiles in backup set 388
File LV Type Ckp SCN | Ckp Time Name |
---- -- ---- ---------- --------- ----
4 | 0 Incr 14255660568868 09-MAR-17 /u02/oradata/TEST/TEST_TBS_01.DBF |
Backup Set Copy #1 of backup set 388
Device Type Elapsed Time Completion Time Compressed Tag
----------- ------------ --------------- ---------- ---
DISK | 00:00:47 | 09-MAR-17 | YES | TAG20170309T081146 |
| List of Backup Pieces for backup set 388 Copy #1 |
| BP Key Pc# Status | Piece Name |
| ------- --- ----------- ---------- |
| 389 | 1 AVAILABLE /u05/TEST/full/full_TEST_20170309_df_321_a1rumci2_1_1 |
| 390 | 2 AVAILABLE /u05/TEST/full/full_TEST_20170309_df_321_a1rumci2_2_1 |
BS Key Type LV Size | Device Type Elapsed Time Completion Time |
------- ---- -- ---------- ----------- ------------ ---------------
466 | Incr 0 178.55M | DISK | 00:00:47 | 09-MAR-17 |
List of Datafiles in backup set 466
File LV Type Ckp SCN | Ckp Time Name |
---- -- ---- ---------- --------- ----
4 | 0 Incr 14255660598721 09-MAR-17 /u02/oradata/TEST/TEST_TBS_01.DBF |
Backup Set Copy #1 of backup set 466
Device Type Elapsed Time Completion Time Compressed Tag
----------- ------------ --------------- ---------- ---
DISK | 00:00:47 | 09-MAR-17 | YES | TAG20170309T084945 |
| List of Backup Pieces for backup set 466 Copy #1 |
| BP Key Pc# Status | Piece Name |
| ------- --- ----------- ---------- |
| 469 | 1 AVAILABLE /u05/TEST/full/full_TEST_20170309_df_399_cfrumep9_1_1 |
| 470 | 2 AVAILABLE /u05/TEST/full/full_TEST_20170309_df_399_cfrumep9_2_1 |
BS Key Type LV Size | Device Type Elapsed Time Completion Time |
------- ---- -- ---------- ----------- ------------ ---------------
529 | Incr 0 178.55M | DISK | 00:00:46 | 09-MAR-17 |
List of Datafiles in backup set 529
File LV Type Ckp SCN | Ckp Time Name |
---- -- ---- ---------- --------- ----
4 | 0 Incr 14255660599189 09-MAR-17 /u02/oradata/TEST/TEST_TBS_01.DBF |
Backup Set Copy #1 of backup set 529
Device Type Elapsed Time Completion Time Compressed Tag
----------- ------------ --------------- ---------- ---
DISK | 00:00:46 | 09-MAR-17 | YES | TAG20170309T085231 |
| List of Backup Pieces for backup set 529 Copy #1 |
| BP Key Pc# Status | Piece Name |
| ------- --- ----------- ---------- |
| 534 | 1 AVAILABLE /u05/TEST/full/full_TEST_20170309_df_477_etrumeug_1_1 |
| 535 | 2 AVAILABLE /u05/TEST/full/full_TEST_20170309_df_477_etrumeug_2_1 |
RMAN>
However, when I checked the logs and queried v$session_longops and v$backup_async_io I I do not see any parallelism even though parallelism is already set to 6. Parallelise is only upto backupset but not on backuppiece.
In the log below, RMAN created 6 channel as per my controlfile, but only three channel is working per backupset. MY backupset has two backuppiece each so I expected all channel will be used in the validation, but it did not happen.
RMAN> validate backupset 388,466,529;
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=505 devtype=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: sid=468 devtype=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: sid=476 devtype=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: sid=477 devtype=DISK
allocated channel: ORA_DISK_5
channel ORA_DISK_5: sid=487 devtype=DISK
allocated channel: ORA_DISK_6
channel ORA_DISK_6: sid=492 devtype=DISK
channel ORA_DISK_1: starting validation of datafile backupset
channel ORA_DISK_2: starting validation of datafile backupset
channel ORA_DISK_3: starting validation of datafile backupset
channel ORA_DISK_1: reading from backup piece /u05/TEST/full/full_TEST_20170309_df_321_a1rumci2_1_1
channel ORA_DISK_2: reading from backup piece /u05/TEST/full/full_TEST_20170309_df_399_cfrumep9_1_1
channel ORA_DISK_3: reading from backup piece /u05/TEST/full/full_TEST_20170309_df_477_etrumeug_1_1
channel ORA_DISK_1: restored backup piece 1
piece handle=/u05/TEST/full/full_TEST_20170309_df_321_a1rumci2_1_1 tag=TAG20170309T081146
channel ORA_DISK_1: reading from backup piece /u05/TEST/full/full_TEST_20170309_df_321_a1rumci2_2_1
channel ORA_DISK_2: restored backup piece 1
piece handle=/u05/TEST/full/full_TEST_20170309_df_399_cfrumep9_1_1 tag=TAG20170309T084945
channel ORA_DISK_2: reading from backup piece /u05/TEST/full/full_TEST_20170309_df_399_cfrumep9_2_1
channel ORA_DISK_3: restored backup piece 1
piece handle=/u05/TEST/full/full_TEST_20170309_df_477_etrumeug_1_1 tag=TAG20170309T085231
channel ORA_DISK_3: reading from backup piece /u05/TEST/full/full_TEST_20170309_df_477_etrumeug_2_1
channel ORA_DISK_1: restored backup piece 2
piece handle=/u05/TEST/full/full_TEST_20170309_df_321_a1rumci2_2_1 tag=TAG20170309T081146
channel ORA_DISK_1: validation complete, elapsed time: 00:00:42
channel ORA_DISK_2: restored backup piece 2
piece handle=/u05/TEST/full/full_TEST_20170309_df_399_cfrumep9_2_1 tag=TAG20170309T084945
channel ORA_DISK_2: validation complete, elapsed time: 00:00:42
channel ORA_DISK_3: restored backup piece 2
piece handle=/u05/TEST/full/full_TEST_20170309_df_477_etrumeug_2_1 tag=TAG20170309T085231
channel ORA_DISK_3: validation complete, elapsed time: 00:00:42
RMAN> exit
unused channel is shown even in the queries. I ran two queries while validation is running. you can see SID 500, 363 and 950 was not used through out the session.
SQL> @rman_mon.sql
TYPE STATUS FILENAME BUFFER_SIZE BUFFER_COUNT
--------- ----------- -------------------------------------------------------------------------------- ----------- ------------
INPUT IN PROGRESS /u05/TEST2/full/full_TEST2_20170309_df_321_a1rumci2_2_1 1048576 4
INPUT IN PROGRESS /u05/TEST2/full/full_TEST2_20170309_df_399_cfrumep9_2_1 1048576 4
INPUT IN PROGRESS /u05/TEST2/full/full_TEST2_20170309_df_477_etrumeug_2_1 1048576 4
USERNAME SERIAL# SID MODULE PROGRAM CLIENT_INFO EVENT
------------------------------ ---------- ---------- ------------------------------ ------------------------------------------------ ------------------------------ ----------------------------------------------------------------
SYS 499 468 restore full datafile rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_2 RMAN backup & recovery I/O
SYS 234 476 restore full datafile rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_3 RMAN backup & recovery I/O
SYS 500 477 rman@VM01 ( rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_4 SQL*Net message from client
TNS V1-V3)
SYS 363 487 rman@VM01 ( rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_5 SQL*Net message from client
TNS V1-V3)
SYS 950 492 rman@VM01 ( rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_6 SQL*Net message from client
TNS V1-V3)
USERNAME SERIAL# SID MODULE PROGRAM CLIENT_INFO EVENT
------------------------------ ---------- ---------- ------------------------------ ------------------------------------------------ ------------------------------ ----------------------------------------------------------------
SYS 193 505 restore full datafile rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_1 RMAN backup & recovery I/O
6 rows selected.
SQL> @rman_mon.sql
TYPE STATUS FILENAME BUFFER_SIZE BUFFER_COUNT
--------- ----------- -------------------------------------------------------------------------------- ----------- ------------
INPUT IN PROGRESS /u05/TEST2/full/full_TEST2_20170309_df_399_cfrumep9_2_1 1048576 4
USERNAME SERIAL# SID MODULE PROGRAM CLIENT_INFO EVENT
------------------------------ ---------- ---------- ------------------------------ ------------------------------------------------ ------------------------------ ----------------------------------------------------------------
SYS 499 468 restore full datafile rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_2 RMAN backup & recovery I/O
SYS 234 476 restore full datafile rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_3 SQL*Net message from client
SYS 500 477 rman@VM01 ( rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_4 SQL*Net message from client
TNS V1-V3)
SYS 363 487 rman@VM01 ( rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_5 SQL*Net message from client
TNS V1-V3)
SYS 950 492 rman@VM01 ( rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_6 SQL*Net message from client
TNS V1-V3)
USERNAME SERIAL# SID MODULE PROGRAM CLIENT_INFO EVENT
------------------------------ ---------- ---------- ------------------------------ ------------------------------------------------ ------------------------------ ----------------------------------------------------------------
SYS 193 505 restore full datafile rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_1 SQL*Net message from client
6 rows selected.
SQL> @rman_mon.sql
no rows selected
USERNAME SERIAL# SID MODULE PROGRAM CLIENT_INFO EVENT
------------------------------ ---------- ---------- ------------------------------ ------------------------------------------------ ------------------------------ ----------------------------------------------------------------
SYS 499 468 restore full datafile rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_2 SQL*Net message from client
SYS 234 476 restore full datafile rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_3 SQL*Net message from client
SYS 500 477 rman@VM01 ( rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_4 SQL*Net message from client
TNS V1-V3)
SYS 363 487 rman@VM01 ( rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_5 SQL*Net message from client
TNS V1-V3)
SYS 950 492 rman@VM01 ( rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_6 SQL*Net message from client
TNS V1-V3)
USERNAME SERIAL# SID MODULE PROGRAM CLIENT_INFO EVENT
------------------------------ ---------- ---------- ------------------------------ ------------------------------------------------ ------------------------------ ----------------------------------------------------------------
SYS 193 505 restore full datafile rman@VM01 (TNS V1-V3) rman channel=ORA_DISK_1 SQL*Net message from client
6 rows selected.
Query I used
SQL> !cat rman_mon.sql
set linesize 300
col filename for a80
select type, status,filename,buffer_size,buffer_count from v$backup_async_io where type <> 'AGGREGATE' and status='IN PROGRESS';
col username for a30
select username, serial#, sid, module,program, client_info,event from v$session where client_info like '%man channel%';
SQL>
My question is:
Is there a way to force rman to validate a backupset serially or even down to backup piece itself? I think if we can parallelize the validation upto backup piece, we can make better usage of the channel and faster completion of the backup script.
DB Version: 11.2.0.4
OS: Linux x86_64 Redhat 4.4
Current RMAN configuration of the live DB
Fileperset = 20
maxpiecesize = 3G
parallelism = 6
We are doing a direct disk backup to zfs NFS.
Message was edited by: 944772
Added RMAN current configuration of the Live DB.