Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Change Date using LAG or something else

user-zwrmqMay 18 2023 — edited May 18 2023

I hope this post is not so long that no one reads it, I tend to over explain myself.

Hi, I'm trying to use LAG function in a sub query to find the date of change for a specific column but the internal business Oracle tool that I use is having a problem processing so I'm trying to find a more efficient method. The tool that i'm restricted to using is oracle based, but I'm sorry I do not know what version and I test all of my SQL in AquaDataStudio before pasting the SQL to my other tool. My query runs in about 2 minutes in ADS but in my experience it can take as long as 30 minutes in my other tool for other queries that take 2 minutes in ADS.

The audit table that I'm querying contains ~30M rows with ~50 columns and my final results contain < 7k rows, so I would like to figure a way to skinny down my subquery.

right now I'm using the LAG function in my subquery and then my main query contains the WHERE clause that limits of changes in the past 5 years:

WHERE prev_group != new_group AND change_date > CURRENT_DATE - 5 YEARS;

For this question I have removed irrelevant tables, columns. Here is my query.

SELECT *
FROM (
    SELECT DISTINCT
    TDAD.DAD_DRG_NDC_ID ID
    ,LAG(TDAD.DAD_GPI_CD) OVER (PARTITION BY TDAD.DAD_DRG_NDC_ID
        ORDER BY TDAD.DAD_LAD_TS) prev_group
    ,TDAD.DAD_GPI_CD new_group
    ,CAST (TDAD.DAD_LAD_TS AS DATE) change_date
    FROM DVV.TDAD_DRUG_AUDIT TDAD
    WHERE TDAD.DAD_DRG_NDC_ID = '2144401' --THIS LINE IS ONLY HERE FOR THE EXAMPLE
    )A
WHERE prev_group != new_group
    AND change_date > CURRENT_DATE - 5 YEARS;

I'm using a specific ID as an example and normally this is not limited to a specific ID. This data is accurate.

If I look at the results from just my subquery, this is what it looks like and the highlighted record is the one i'm expecting to see in my final results (with a change_date = 10/2/2018).

But if I try to limit my subquery to the past 5 years, this is what I get and I think contains an additional error because row 2 does not seem accurate, but what I'm concerned with is that it does not produce the accurate 10/2/2018 record (as seen above).

SELECT DISTINCT
TDAD.DAD_DRG_NDC_ID ID
,LAG(TDAD.DAD_GPI_CD) OVER (PARTITION BY TDAD.DAD_DRG_NDC_ID
    ORDER BY TDAD.DAD_LAD_TS) prev_group
,TDAD.DAD_GPI_CD new_group
,CAST (TDAD.DAD_LAD_TS AS DATE) change_date
FROM DVV.TDAD_DRUG_AUDIT TDAD
WHERE TDAD.DAD_DRG_NDC_ID = '2144401' --THIS LINE IS ONLY HERE FOR THE EXAMPLE
   AND TDAD.DAD_LAD_TS > CURRENT_DATE - 5 YEARS

And then if I rerun the subquery excluding the LAG function and the date exclusion this is what I see:

So it seems the WHERE in my subquery is keying on the CHANGE_DATE from the previous record (12/8/2017).

I hope this makes sense and I hope someone can follow this. Thanks.

Comments
Post Details
Added on May 18 2023
17 comments
1,161 views