Skip to Main Content

Oracle Database Discussions

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Query works correctly then returns incorrect results but no error!

735519Nov 19 2009 — edited Nov 24 2009
I'm a web application programmer, and have run into a very perplexing issue that I've been troubleshooting against an Oracle 10g database for almost 2 weeks. A query that should return approximately 70 rows will return correct results for a time, then will suddenly incorrectly start returning only 1 or 2 rows of the 70, without generating any errors. Altering some inconsequential part of the query (e.g. changing the indentation, or switching from uppercase to lowercase on a table or view name) might result in the correct number of rows being returned. Between the programmers, webmasters, and DBAs here, we're running out of ideas on how to troubleshoot this issue, and would appreciate any insight you might have. Why is it returning incomplete results some of the time? How can we get the query to return the right results every time?

Spec for query
I've got a somewhat complex query that finds groups of trips where a single person travels one or more times to the same destination, such that trips in the group have at most 30 days between the end of one trip and the beginning of the next trip, and the number of days between the earliest arrival date and the latest return date is greater than 150 days, and the latest return date is some time after 30 days ago. I'm using factored subqueries -- WITH subquery1 AS (SELECT ... FROM tripinfotable), subquery2 AS (SELECT ... FROM subquery1), etc. for about 230 lines of SQL code. The result set includes both the traveler and destination information, as well as details about each set of orders that contributed to the groups of trips. The query sifts through some 18K records, and returns approximately 70 rows of data, representing 30 travelers whose recent trip groups meet the criteria outlined here.

Issue
After developing this query in TOAD 9.7, and formatting it using TOAD's code formatter, I copied it into a Coldfusion 8 cfquery tag. When executing the query and dumping the results to a web page, I see 1 or 2 rows, but not the full 70 I was expecting. Note, the two rows are valid results, included in the full 70 rows, but woefully incomplete. Neither Oracle nor Coldfusion reported any errors either onscreen or in the Coldfusion log files. I altered the query to look for groups of trips in excess of 60 days instead of 150, and this time running the query in TOAD returned about 300 rows, while running it from the web page returned 14 rows.

Troubleshooting
My first thought was that Coldfusion had done something to the outbound query or the result set to prevent the data from displaying. So, I created a stored procedure and put the query in it. This would eliminate the need to pass it as an inline query from Coldfusion. I got two rows back.

I enabled the database activity log in the Coldfusion Administrator Data Source Names properties for that data source, examined the log file, and I saw the query, formatted exactly as it is in the CF code, being sent to Oracle, but only two records being logged in the returned data. I created a new table in the database for troubleshooting purposes, and rewrote the query to select the records into the new table. When I ran it and checked the table I saw that it contained 70 records. I wrote my test page to have both the select into query and the original inline query and ran it (after truncating the temp table) and got 2 records displayed on the screen, but 70 in the table. I copied the select into query and pasted it in the inline query and removed the select into bit, and this time I got 70 records on the screen and 70 in the table. Huh????? I went back to my original inline query and this returned 2 rows. I looked at the cfdump output, copied the query text info contained in each block -- the one that worked and the one that returned 2 rows, and compared them. They look identical, yet are returning different results.

The fact that I got valid data back at some point indicates to me that there's no problem with the data being returned to the Coldfusion application server. The result set is not changing from run to run.

I created a view and put the query there, and wrote my cfquery to simply be "select * from twa.vw_LongTermTdy". I got two rows back. I added a new cfquery that does "select * from vw_LongTermTDY" (last two characters upper case), and got 70 rows back. My DBA created a new view, and for each variation on case I've attempted to "select * from vw_LongTermTDY2", I've gotten 70 rows back. My most recent troubleshooting test page has five cfquery tags: the uppercase version of selecting * from the view first, then the lowercase, then a couple variations of selecting * from the second view, then the inline version of the query. It returned 70, 2, 70, 70, and 2 rows. As part of the troubleshooting our DBA has been auditing a refresh of the web page to see if anything odd shows up (nothing has). Last night, before one of the runs, he flushed the memory in the database. As soon as he did that, the next refresh of the page showed 70 records for all five queries. Huh????? Further variations resulted in the correct number of rows each time.

So, last night I wrote a report page in the travel application using the inline query and got it working to display all the relevant information. I checked it from my login and it displayed correctly, with all 70 records. I had another programmer check and he saw all 70 records. This morning, I checked it again, and saw 70 records. When my fellow programmer checked a few minutes later, he saw only 2 records. Immediately, I refreshed my view and saw only 2 records. I reloaded my test page and saw 70, 70, 70, 70, 2 records (first 4 are the views, and the last was the identical SQL inline). I asked my DBA to flush memory again, and as soon as he did, all 5 queries on the test page return 70 rows, as well as the report page. A couple hours later, I asked the other programmer to check again, and he got only 2 records -- refreshing my report page shows only 2 records. Each time it drops to 2 rows, it's the same 2 rows.

Ben Franklin defined insanity as repeating the same steps over and over and expecting different results. I'm about to make him my patron saint...

Our plausible explanation, is that while the query itself is not case sensitive, Oracle hashes the text and caches the query to improve performance if it's needed again. For reasons we have yet to determine, that query sometimes gets corrupted somehow (it's the same way each time it does), and the corrupted version gets stored against that hash. At this point, we have yet to see it self-correct. The only way we can get it to behave correctly is to have the DBA purge memory in the database. This does not last very long though, and is not an acceptable long term solution. What is causing the query to return a partial recordset? How can we get it to work correctly every time? Where should we be looking that we haven't already to try to figure this out?

Thanks for any insight you might be able to share with me!

-matt
Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on Dec 22 2009
Added on Nov 19 2009
7 comments
6,314 views