Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Aggregate table design help

user13047999Oct 17 2023 — edited Oct 17 2023

I am using oracle 19c.

we load events data into oracle table and we need to build an agg table on daily basis. Here is the sample data.

create table cs_events (
session_date date,
session_id varchar2(100),
device varchar2(100),
channel varchar2(100),
event varchar2(100),
event_timestamp timestamp)
;

insert into cs_events values (date '2023-10-01','3f6e90d6-20231001101','desktop','paid search','home page viewed',timestamp '2023-10-01 6:01:00');
insert into cs_events values (date '2023-10-01','3f6e90d6-20231001101','desktop','paid search','product page viewed',timestamp '2023-10-01 6:01:00');
insert into cs_events values (date '2023-10-01','3f6e90d6-20231001101','desktop','paid search','product added to cart',timestamp '2023-10-01 6:02:00');
insert into cs_events values (date '2023-10-01','3f6e90d6-20231001101','desktop','paid search','checkout started',timestamp '2023-10-01 6:03:00');
insert into cs_events values (date '2023-10-01','3f6e90d6-20231001101','desktop','paid search','order completed',timestamp '2023-10-01 6:04:00');
insert into cs_events values (date '2023-10-01','3f6e90d6-20231002101','desktop','email','home page viewed',timestamp '2023-10-01 6:00:00');
insert into cs_events values (date '2023-10-01','3f6e90d6-20231002101','desktop','email','product page viewed',timestamp '2023-10-01 6:01:00')
;insert into cs_events values (date '2023-10-01','3f6e90d6-20231003101','desktop','email','product page viewed',timestamp '2023-10-01 6:01:00')
;insert into cs_events values (date '2023-10-01','3f6e90d6-20231003101','desktop','email','product added to cart',timestamp '2023-10-01 6:02:00');
insert into cs_events values (date '2023-10-01','3f6e90d6-20231003101','desktop','email','checkout started',timestamp '2023-10-01 6:03:00');
insert into cs_events values (date '2023-10-01','3f6e90d6-20231003101','desktop','email','order completed',timestamp '2023-10-01 6:04:00');

Currently we are using wide table approach and we created daily aggregate table as below.

create table cs_events_agg_daily as
select session_date, device, channel,
count(distinct session_id) sessions,
count(distinct case when event = 'home page viewed' then session_id end) home_page_sessions,
count(distinct case when event = 'product page viewed' then session_id end) product_page_sessions,
count(distinct case when event = 'product added to cart' then session_id end) add_to_cart_sessions,
count(distinct case when event = 'checkout started' then session_id end) checkout_sessions,
count(distinct case when event = 'order completed' then session_id end) purchase_sessions
from cs_events
group by session_date, device, channel
;

We share this agg table with other teams for data analysis purposes and this format is easy for querying as well. But the challenge is whenever we want to add a new metric or dimension to this agg table, we need to perform the below tasks

  • alter table to add a new column
  • run backfill job to populate data for this new column

Could you please let me know which is the best approach to build agg tables for these requirements?

wide table approach - which contains all dimensions and metrics in the row column format

tall table approach - adding each metric in a separate row as mentioned below.

session_date device channel metric_name metric_value

2023-10-01 desktop email sessions 2

2023-10-01 desktop email home_page_sessions 1

Comments
Post Details
Added on Oct 17 2023
6 comments
333 views