Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Solution design and performance - SQL or PL/SQL?

victorpFeb 16 2012 — edited Feb 20 2012
I'm hoping I can get some performance advice on the best approach to tackle my problem.

I've renamed things to hopefully make it a bit more understandable...

I have a set of records which keep track of a status at a given date for a range of points on a given road. Basically, I need the current bits to "shine through" eg
   |-----|  |----------|   1-jan-2008
          |--|              5-jun-2008
       |------|            1-jan-2009
           |----|           12-feb-2011

|--|---|---|----|------|-----|  (this is the "current" view of the data)
The table holding the 5 records above is essentially:
create table my_status(
  my_status_id number,
  status varchar2(50),
  road_id number,
  start_point number,
  end_point number,
  status_date date
);
so the 5 records would be
insert into my_status values(1, 'OPEN', 1, 20, 50, '01-jan-2008');
insert into my_status values(2, 'CLOSED', 1, 80, 150, '01-jan-2008');
insert into my_status values(3, 'OTHER', 1, 55, 85, '05-jun-2008');
insert into my_status values(4, 'OTHER', 1, 40, 90, '01-jan-2009');
insert into my_status values(5, 'OPEN', 1, 60, 100, '12-feb-2011');
The result required is to display the my_status_id that is current and the range that it applies to ie
current_from, current_to, my_status_id
20,39,1
40,59,4
60,100,5
101,150,2
151,200,0
so my_status_id of 3 doesn't show up as it has entirely been superseded by later records. Also note that 0-19 and 151-200 comes back as my_status_id = 0 which is just to show there is no record for that range.

This is just a very small set of the data as the calculation needs to be done for every road in the system. The roads are defined as follows:
create table road(
  road_id number,
  road_desc varchar2(100)
)

insert into road values (1,'Road ABC')
and also each road has a list of valid point:
create table road_bit(
  road_id number,
  start_pt number,
  end_pt number
)

insert into road_bit values (1,0,200)
insert into road_bit values (1,250,500)
So basically, I need to go through every road_bit record and get back the my_status_id and range it applies to (and 0 when no current record as per above).

The brute force approach is to look at each pt between the start_pt and end_pt of the road_bit and find the current my_status_id. You can then group these up to get a start and end range for each my_status_id (ie when the my_status_id changes/breaks). This is quite intensive processing and considering the total number of pts that need to be calculated is around 350 million it's not very fast! It's obviously fine on a smaller scale but the query time is linear compared to the number of pts you query. I've tried various approaches from building in PLSQL with a table function and also straight SQL but they take far too long. It's likely the end solution will be run overnight as a materialized view so I'm not too worried if it takes a few hours, I just can't have it taking a few days:-)

The current my_status_id for a given pt can be found as follows:
SELECT  nvl(max(s.my_status_id),0) my_status_id
FROM  my_status s
WHERE  s.road_id = :p_road_id
AND  :pt between s.start_point and s.end_point
AND  NVL(s.status_date,TO_DATE('01-JAN-0001','DD-MON-YYYY'))  = 
  (SELECT  MAX(NVL(z.status_date,TO_DATE('01-JAN-0001','DD-MON-YYYY')))
   FROM  my_status z
   WHERE z.road_id = s.road_id
   AND  :pt between z.start_point and z.end_point  )
I'm hoping there's a slightly smarter way to go about this which avoids essentially looping 350M times and running a costly query. I look forward to no doubt being baffled by some solutions!
Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on Mar 17 2012
Added on Feb 16 2012
11 comments
1,336 views