Hi Folks,
I am Tasked with implementing Partitioning of quite a certain Table; I have a concept I like to share and hear opinions about it if you may.
First of all the current situation:
Say Table X has the following columns (schematic):
| Column Name | Data Type | Description |
|---|
| ID | NUMBER | Primary Key (Surrogate Key) |
| KEYTYPE | VARCHAR2 | Type of the Record; there are about 10 different types say A,B,C and the like |
| NREF1 | NUMBER | Foreign Key; depending on KEYTYPE this can have different Masters (and thus there is no real foreign key on the column) - composite is also possible (with other XRefN) |
| VREF1 | VARCHAR2 | Foreign Key; depending on KEYTYPE this can have different Masters (and thus there is no real foreign key on the column) - composite is also possible (with other XRefN) |
| VREF2 | VARCHAR2 | Foreign Key; depending on KEYTYPE this can have different Masters (and thus there is no real foreign key on the column) - composite is also possible (with other XRefN) |
| XREFN | something | there are more ref columns, for the sake of example the above should suffice |
| RECORD_DATE | DATE | Timestamp of a record which is populated depending on KEYTYPE |
| VVAL1-N | VARCHAR2 | Several String Value columns for storing string values |
| NVAL1-N | NUMBER | Several Numeric Value columns for storing number values |
| DVAL1-N | DATE | Several Date Value columns for storing date values |
(In case anyone is wondering: this is a very old take on storing unstructured data in some fashion).
Basically depending on KEYTYPE the REF columns are populated; e.g.
- when KEYTYPE = A the NREF1 column contains a reference (= a foreign key) to the PK of TABLE_A
- when KEYTYPE = B the VREF1 and VREF2 columns contains a reference to the (composite) PK of TABLE_B
- when KEYTYPE = C the VREF1 column contains a reference to the PK of TABLE_C
- when KEYTYPE = X then several XREFN columns contains references (I guess you get the concept)
- and so on
- the XVALN columns contain unstructured data stored to the master depending on the KEYTYPE (like readings of a scale, the length of a shelf,...)
- indices are put on a mixture of columns; e.g. there is an index on NREF1, there is another one on VREF1, there is another one on VREF1 and VREF2,...
so basically I have an ARC on the table with KEYTYPE as distinctor - but unfortunately the foreign key data is put in meaningless XREFN columns and mixed up making a real foreign key impossible - at least with the data structure as it is now.
A few facts about the data:
- KEYTYPE is a fixed list of possible values making it a good candidate for list partitioning; however: a lot of customers use only one KEYTYPE; some use 2 different KEYTYPES, some use 3. Nobody uses all the different KEYTYPES. So list partitioning would only make sense if a customer would use more then one KEYTYPEs (and would get better the more KEYTYPES are in use) but wouldn't make sense if only one KEYTYPE is in use.
- RECORD_DATE is a timestamp which is used for some KEYTYPES so interval partitioning could be possible. RECORD_DATE is only populated in a few KEYTYPES so I can't partition by them if a customer uses a KEYTYPE where RECORD_DATE is not populated
- In almost all cases as already said the data is stored as detail to a master making it a candidate for reference partitioning. The XREFN structure of the key data however prevents that (for now) - also reference partitioning is hard if there are 2 (or more) different KEYTYPEs with the very same data distribution - one KEYTYPE would benefit from the reference partitioning whereas the others would suffer from it. And as for now I am not aware that I can use 2 different reference partitioning strategies for the very same table.
I am pretty sure it is a very good Idea to break those XREFN columns apart in real columns so I can put foreign key constraints on them - as said this doesn't provide me with a partition strategy in all cases, but in some of them. I could do different partitioning strategies depending on the scenario, But this sounds somewhat complicated.
in short: what I would need (I guess) would be a composite partitioning (LIST-something I guess) where the sub partition strategy is different for each partition (e.g. partition KEYTYPE_A has subpartition intervall on RECORD_DATE, partition KEYTYPE_B has subpartition reference partition on the migrated NREF1 column,.....). But I am not aware that this is possible as well - if it were I am eager to listen.
So my solution to the whole tragedy would be: break the whole table apart. I'd make a Table for each and every Keytype in the current table. The XREFN columns would be broken apart in the "real" columns as well (and of course only put in place in the keytype tables where they are needed). Those Tables are then partitioned depending on their nature as I'd do for the distinct keytypes.
The original table would be kept in place - but only the ID and the KEYTYPE columns would survive in it - the KEYTYPE_TABLEs have a Surrogate Primary key as well which is a 1:1 relation to the original table in order to have the ID unique among all Tables.
The Table already is accessed via a PL/SQL API only, so I would hide the phyiscal implementation of those Tables and I could even create a view for backward compability for those who like to select on that Table manually (an instead of trigger might be possible as well which then would in turn call the API).
Breaking the Table apart would have those Benefits:
- I can do different Partitioning Strategies for each Keytype. I could even do a composite partitioning for a keytype for all I care
- I don't have several different partitioning strategies for one Table out there for different customers- at least from what I have understood of the concept of that Table. The Partitioning Strategy for the KEYTYPE Subtable can be fixed again, and is the very same for each and every customer. In some configurations some Tables might contain data, in some they won't.
- The Indices on the XREFN columns could be put on the Tables where they belong (this is not so different from partitioning, but anyway). It is not necessary to maintain an index for values of NREF1 when the "Foreign Key" Value is based on NREF1 and VREF1 which would make the different indices smaller.
- I am even playing with the thought to store the XVALN columns as JSON Data which was introduced in 12c - but as for now I have to support 11.2.0.4 Databases (and onward) but that could change. Does anyone have experience with that feature?
Or does anyone have a better Idea? I am all ears for suggestions.
As already said, I have to support 11.2.0.4 for now, but if there is a good solution which works on 18c and onwards I am eager to hear that as well .
Thanks in Advance!
cheers,
Christian