I have a question, and rather than start a new thread I am posting it here, because it seems closely related: I'm trying to reconcile Oracle's guarantee of recovery from any single point of failure with modern disk arrays.
I have been involved with Oracle since version 6 and am very comfortable with the architecture (the relationship of control files, redo log files and data files) which guarantees the recovery of all committed transactions after any single point of failure.
Here's the question:
We are building a data warehouse (estimated at 2TB), which is a read-only application apart from the nightly loads. This is single-instance Oracle. The storage array consists of 16 disks, each 500 GB, giving 8TB total physical space.
We have not configured the disk array yet. I'm picking up this task from the previous DBA and reviewing his design, which reserves 2 disks as hot spares, mirrors the remaining 14 as RAID 1+0, and then gathers these into a single logical volume, giving 3.5TB of usable space (excluding the hot spares). His design is:
Disks 1 thru 14 (500 GB each x 14, RAID 1+0 = 3.5TB usable space), single logical volume holding:
control01.ctl
control02.ctl
control03.ctl
redo logs
all tablespaces (SYSTEM, SYSAUX, USERS, TEMP, etc.)
Disk 15: hot spare
Disk 16: hot spare
My concern is about the single logical volume holding ALL the control files and ALL the redo log files. Being trained with Oracle 6, I learned to make sure that the three control files were on physically separate drives, and that the redo logs were mirrored across two separate disk drives. But with a single huge logical volume, how can you tell that this has been achieved?
The problem that I foresee in the current storage design is that the logical volume becomes the single point of failure and that the data warehouse will be unusable and unrecoverable if ANY of the 14 disks in the single logical volume fails. Yes, the hot spares are supposed to kick in, but this still means that I am relying completely on the disk array to keep the data warehouse running and that Oracle itself will no longer have any control.
I would consider something like this instead:
Disk 1 (500GB), not mirrored:
control01.ctl, redo logs copy 1
Disk 2 (500GB), not mirrored:
control02.ctl, redo logs copy 2
Disks 3 thru 14 (500 GB each x 12, RAID 1+0 = 3TB usable space), single logical volume holding:
control03.ctl
all tablespaces (SYSTEM, SYSAUX, USERS, TEMP, etc.)
Disk 15: hot spare
Disk 16: hot spare
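To make the capacity trade-off between the two layouts concrete, here is a rough Python sketch of the arithmetic. It assumes two-way mirroring in the RAID 1+0 set and 500 GB per disk as described above; the function and variable names are made up purely for illustration and do not come from any Oracle or array tool.

```python
DISK_GB = 500  # per-disk size, as stated in the post


def raid10_usable_gb(disks, disk_gb=DISK_GB):
    """Usable capacity of a RAID 1+0 set, assuming two-way mirrors (half the raw space)."""
    if disks % 2:
        raise ValueError("RAID 1+0 with two-way mirrors needs an even number of disks")
    return disks // 2 * disk_gb


def layout_summary(raid10_disks, standalone_disks, hot_spares):
    """Summarize one layout: disk count, mirrored usable space, unmirrored space."""
    return {
        "total_disks": raid10_disks + standalone_disks + hot_spares,
        "raid10_usable_gb": raid10_usable_gb(raid10_disks),
        "standalone_gb": standalone_disks * DISK_GB,
    }


# Previous DBA's design: 14 disks in RAID 1+0, 2 hot spares.
current = layout_summary(raid10_disks=14, standalone_disks=0, hot_spares=2)

# Alternative design: 2 unmirrored disks, 12 disks in RAID 1+0, 2 hot spares.
proposed = layout_summary(raid10_disks=12, standalone_disks=2, hot_spares=2)

print("current: ", current)
print("proposed:", proposed)
```

Under these assumptions the alternative layout gives up 500 GB of mirrored space in exchange for two standalone disks; both still leave room for the estimated 2TB warehouse.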
Please comment on what has happened to Oracle's recommended disk configurations and guarantee of recovery in this day and age where, it seems, we literally put all of our data eggs into one large data basket. Is it now standard practice to do this? Or would my proposed disk layout be safer? Is there a MetaLink doc I should read?
-- Chris Curzon