Hello, I am a data analyst at a healthcare organization.
One of the common problems we have is to identify all the patients that were present at a given date.
Any query that uses this will run extremely slow.
The table includes the dates the patient was admitted and discharged, but not the dates in between.
table Encounters e
.PatientID
.AdmitDate
.DischargeDate
SELECT distinct e.patientID
FROM Encounters e
WHERE e.AdmitDate <= [date]
AND e.DischargeDate >= [date]
So lets say I pick yesterday as the [date]. The query optimizer will run an index scan on the discharge date starting yesterday.
Then it will filter the remaining rows for the Admit Date. No problem!
But if I were to pick a date say 5 years ago, it would have to read every record with a discharge date between then and now. (some patients are in long-term care, where they may be housed for a year or more)
AdmitDate <----------------------------------[date]
DischargeDate [date]------------------------------------->
Both e.AdmitDate and e.DischargeDate are indexed, but not combined. (It is a vendor hosted database) But I am not sure how much a compound index would help, you would still have to read a ton of records.
I am sure others must have dealt with this problem. How would you approch the solution? Is there a name for this particular problem?