Hi Experts,
As per my understanding, incremental load is only the new data (with insert append or incremental load (update/insert or merge, with or without SCD behaviour)).
While look into KM code of the IKM's the below is my understanding:
Incremental Update : Based upon the PK defined in the target data store, this KM will check row by row and do insert/update (will capture changed data).
Control Append : Blindly do a bulk insert in to target table and no changed data will be captured. It wont do truncate and insert whereas to aviod duplicates in data, have a PK defined in target table so whenever duplicate data comes it can be prevented or go for CKMs.
Now my doubt is,
When using incremental update as KM :The scenario is I have one incremental load today, which inserted for example 200000 records today and tomorrow additional 200000 records added (may include updates to some of the rows in previous loaded 2,00,000 records. Now it will scan 4,00,000 (yesterday + todays) and look for changes , I mean to be updated or to be inserted
Because as per my understanding this KM will process row by row all the records (Is my understanding correct ?) . If it reads every record and look for a change or no change i feel that its time and performance issue ? Is CDC is right approach in this scenario or implementing SDC on all columns in table ?
With respect to huge number of records coming on daily basis, If IKM incremental update checking all records for update or insert or no change, i feel its not worth performance wise and huge time to compare between source and target. Can this KM eliminate comparing itself the records from source to target those does not have any change in any column value from previous load ?
sorry if this is silly question. Just trying to figure out which can be best incremental load strategy especially when I have millions of records coming into source on daily basis.
PS; I am remembering ealier JeromeFr our expert member in community, pointed out Partioned Exchange to process only process particular month data when you maintain partitioned tables at database level.
Best Regards
asp