Hi everyone,
I need assistance with optimizing a query, but I'm not sure where to begin.
I have two tables: MSG_EXC_REC, which represents messages, and MOVEMENT, which represents movements.
Each movement can be associated with multiple messages.
The join between these two tables is based on the columns MRN, DOMAIN, and ENV.
Both tables are partitioned.
MSG_EXC_REC is partitioned as below
PARTITION BY RANGE ("M_MIN_DATE") INTERVAL (NUMTOYMINTERVAL(1, 'MONTH')
PARTITION "BEFORE_2010" VALUES LESS THAN (TIMESTAMP' 2010-01-01 00:00:00')
Movement is partitioned as below
PARTITION BY RANGE ("MIN_DATE") INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
PARTITION "BEFORE_2010" VALUES LESS THAN (TIMESTAMP' 2010-01-01 00:00:00')
There are also the below indexes :
CREATE UNIQUE INDEX "IDX_MOVEMENT_MRN" ON "MOVEMENT" ("MRN", "DOMAIN", "ENV", "CRN");
CREATE UNIQUE INDEX "CSMIS2_USER"."IDX_MOVEMENT_MRN" ON "MOVEMENT" ("MRN", "DOMAIN", "ENV", "CRN");
CREATE INDEX "IDX_MSG_EXC_REC_MIN_DATE" ON "MSG_EXC_REC" ("M_MIN_DATE");
CREATE INDEX "IDX_MSG_EXC_REC_MRN" ON "MSG_EXC_REC" ("MRN") ;
CREATE INDEX "IDX_MSG_EXC_REC_MRN_DOMAIN_ENV" ON "MSG_EXC_REC" ("MRN", "DOMAIN", "ENV");
CREATE INDEX "IDX_MSG_EXC_REC_MSG_TYPE_MIN_DATE" ON "MSG_EXC_REC" ("MSG_TYPE", "M_MIN_DATE")
I want to find all the messages in a specific range and domain. From these messages I find the movement (mrn) and retrieve also messages having the same mrn, domain, env.
My initial query is the below :
-- Query 1
SELECT /*+ FIRST_ROWS(2147483647) */ DISTINCT(CCN_UNIQUE_ID),MRN,SENDER,RECIPIENT,MSG_TYPE,MSG_SIZE,MSG_SEND_DATE,MSG_REC_DATE,COD_SEND_DATE,COD_REC_DATE,COA_SEND_DATE,COA_REC_DATE,EXP_SEND_DATE,EXP_REC_DATE,EXC_SEND_DATE,EXC_REC_DATE,MSG_ID,CORREL_ID,MSG_SEND_RT_CODE,MSG_REC_RT_CODE,IS_IECA,CRN,OP_MODE_SENDER,OP_MODE_RECEIVER,M_MIN_DATE,STATE,TED_INVOLVED
FROM ( SELECT RES.CCN_UNIQUE_ID,RES.MRN,RES.SENDER,RES.RECIPIENT,RES.MSG_TYPE,RES.MSG_SIZE,RES.MSG_SEND_DATE,RES.MSG_REC_DATE,RES.COD_SEND_DATE,RES.COD_REC_DATE,RES.COA_SEND_DATE,RES.COA_REC_DATE,RES.EXP_SEND_DATE,RES.EXP_REC_DATE,RES.EXC_SEND_DATE,RES.EXC_REC_DATE,RES.MSG_ID,RES.CORREL_ID,RES.MSG_SEND_RT_CODE,RES.MSG_REC_RT_CODE,RES.IS_IECA,RES.CRN,RES.OP_MODE_SENDER,RES.OP_MODE_RECEIVER,RES.M_MIN_DATE, MOV.STATE, MOV.TED_INVOLVED
FROM MOVEMENT MOV
INNER JOIN MSG_EXC_REC REQ ON REQ.MRN = MOV.MRN AND REQ.DOMAIN = MOV.DOMAIN AND REQ.ENV = MOV.ENV
INNER JOIN MSG_EXC_REC RES ON RES.MRN = MOV.MRN AND RES.DOMAIN = MOV.DOMAIN AND RES.ENV = MOV.ENV
WHERE MOV.DOMAIN = 'NCTS'
AND REQ.DOMAIN = 'NCTS'
AND RES.DOMAIN = 'NCTS'
AND REQ.M_MIN_DATE BETWEEN TO_TIMESTAMP('01/01/2024 00:00:00', 'DD/MM/YYYY hh24:mi:ss') AND TO_TIMESTAMP('01/03/2024 00:00:00', 'DD/MM/YYYY hh24:mi:ss')
ORDER BY MOV.MIN_DATE DESC,REQ.MSG_SEND_DATE DESC ) QRY;
It's execution plan :

After testing different queries without creating any new index, I found that the following query has the lowest cost:
-- Query 2
SELECT /*+ FIRST_ROWS(2147483647) */ REQ.CCN_UNIQUE_ID,REQ.MRN,REQ.SENDER,REQ.RECIPIENT,REQ.MSG_TYPE,REQ.MSG_SIZE,REQ.MSG_SEND_DATE,REQ.MSG_REC_DATE,REQ.COD_SEND_DATE,REQ.COD_REC_DATE,REQ.COA_SEND_DATE,REQ.COA_REC_DATE,REQ.EXP_SEND_DATE,REQ.EXP_REC_DATE,REQ.EXC_SEND_DATE,REQ.EXC_REC_DATE,REQ.MSG_ID,REQ.CORREL_ID,REQ.MSG_SEND_RT_CODE,REQ.MSG_REC_RT_CODE,MOV.IS_IECA,REQ.CRN,REQ.OP_MODE_SENDER,REQ.OP_MODE_RECEIVER,REQ.M_MIN_DATE, MOV.STATE, MOV.TED_INVOLVED
FROM MOVEMENT MOV
INNER JOIN MSG_EXC_REC REQ ON REQ.MRN = MOV.MRN AND REQ.DOMAIN = MOV.DOMAIN AND REQ.ENV = MOV.ENV
WHERE REQ.MRN IN (SELECT DISTINCT RES.MRN
FROM MSG_EXC_REC RES
WHERE RES.DOMAIN = 'NCTS'
AND RES.M_MIN_DATE BETWEEN TO_TIMESTAMP('01/01/2024 00:00:00', 'DD/MM/YYYY hh24:mi:ss') AND TO_TIMESTAMP('01/03/2024 00:00:00', 'DD/MM/YYYY hh24:mi:ss')
)
AND MOV.DOMAIN = 'NCTS'
ORDER BY MOV.MIN_DATE DESC,REQ.MSG_SEND_DATE DESC ;
It's plan is

I then attempted to create a function-based index on the M_MIN_DATE column, which is also the basis for the partitioning.
The new created index is the below :
CREATE INDEX "IDX_MSG_EXC_REC_MINUTE_LEVEL" ON "MSG_EXC_REC" (TRUNC("M_MIN_DATE", 'MI')) ;
So, I have modified the query s below (knowing that I will have minute-level precision on the data)
-- Query 3
SELECT /*+ FIRST_ROWS(2147483647) */ REQ.CCN_UNIQUE_ID,REQ.MRN,REQ.SENDER,REQ.RECIPIENT,REQ.MSG_TYPE,REQ.MSG_SIZE,REQ.MSG_SEND_DATE,REQ.MSG_REC_DATE,REQ.COD_SEND_DATE,REQ.COD_REC_DATE,REQ.COA_SEND_DATE,REQ.COA_REC_DATE,REQ.EXP_SEND_DATE,REQ.EXP_REC_DATE,REQ.EXC_SEND_DATE,REQ.EXC_REC_DATE,REQ.MSG_ID,REQ.CORREL_ID,REQ.MSG_SEND_RT_CODE,REQ.MSG_REC_RT_CODE,MOV.IS_IECA,REQ.CRN,REQ.OP_MODE_SENDER,REQ.OP_MODE_RECEIVER,REQ.M_MIN_DATE, MOV.STATE, MOV.TED_INVOLVED
FROM MOVEMENT MOV
INNER JOIN MSG_EXC_REC REQ ON REQ.MRN = MOV.MRN AND REQ.DOMAIN = MOV.DOMAIN AND REQ.ENV = MOV.ENV
WHERE REQ.MRN IN (SELECT DISTINCT RES.MRN
FROM MSG_EXC_REC RES
WHERE RES.DOMAIN = 'NCTS'
AND TRUNC(RES.M_MIN_DATE, 'MI') BETWEEN TO_TIMESTAMP('01/01/2024 00:00', 'DD/MM/YYYY HH24:MI')
AND TO_TIMESTAMP('01/03/2024 00:00', 'DD/MM/YYYY HH24:MI')
AND RES.M_MIN_DATE BETWEEN TO_TIMESTAMP('01/01/2024 00:00:00', 'DD/MM/YYYY hh24:mi:ss')
AND TO_TIMESTAMP('01/03/2024 00:00:00', 'DD/MM/YYYY hh24:mi:ss')
)
AND MOV.DOMAIN = 'NCTS'
ORDER BY MOV.MIN_DATE DESC,REQ.MSG_SEND_DATE DESC;
The execution plan of the above query is :

We can see that the cost has been reduced a lot.
But when I check it with an outer select count(*) I don't get any huge time reduction which I was expecting
Query1 : 384sec
Query2 : 227sec
Query3 : 336sec
Do you have any suggestions for improvement?
Thanks in advance