Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Look Ma, outer join returns fewer rows than inner join!

mathguyDec 15 2017 — edited Jan 3 2018

Oracle version 12.1.0.2.0, Enterprise Edition.

A while back someone asked on this forum what must have been an interview question they had been asked: When does an outer join return fewer rows than the inner join? (presumably on the same tables and with the same join conditions). Or perhaps a variant - when does the left outer join return no rows, when the inner join does return rows.

Of course, we told them it's not possible.  We were mistaken.

SQL> create table t1 ( x varchar2(10) );

Table created.

SQL> create table t2 ( y varchar2(10) );

Table created.

SQL> insert into t1 values ( 'a' );

1 row created.

SQL> insert into t2 values ( null );

1 row created.

SQL> commit;

Commit complete.

SQL> select t1.x as x from t1 inner join t2 on t1.x like t2.y || '%' where t2.y is null;

X
----------
a

1 row selected.

SQL> select t1.x as x from t1 left outer join t2 on t1.x like t2.y || '%' where t2.y is null;

no rows selected

QED

I can only speculate here - the Optimizer sees the structure of the left outer join query and it thinks it's an anti-join (which indeed in this case should return no rows). The explain plan seems to support that - I see the option "ANTI" there, if that's what it means. But this is not an anti-join, even though it looks like one. Why not? Because null is not always null in Oracle - specifically for strings.  EDIT: As I found out later (see Reply 10 11 below), the bug can be reproduced with columns of NUMBER data type, and a comparison using > (greater-than) and NVL to wrap NULL from the second table. So this is not specific to strings, and to the LIKE comparison operator.

Here is a couple more fun things about this:

1.  Change the SELECT clause to select t1.x as x, t2.y as y from .....  -   now the outer join will produce the correct result.   So: we also have an example where by selecting a subset of columns, the number of rows in the output decreases. (Worse: the query result goes from non-empty to empty.)

2. Instead of storing tables t1 and t2, create them in a WITH clause:

SQL> with t1 as ( select cast('a'  as varchar2(10)) as x from dual ),
  2       t2 as ( select cast(null as varchar2(10)) as y from dual )
  3  select t1.x as x from t1 left outer join t2 on t1.x like t2.y || '%' where t2.y is null;

X
----------
a

1 row selected.

So the bug doesn't manifest itself when "test data" is created in a WITH clause - it only exists when the query reads from tables on disk. I expect that this will be quite nasty to debug.

Is this a known bug?  I don't recall seeing it discussed before. I just noticed it this morning, working on different solutions for another recent thread (see Replies 10-11): how to use IN and NOT LIKE operator together

Cheers,     -    mathguy

This post has been answered by mathguy on Dec 15 2017
Jump to Answer
Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on Jan 12 2018
Added on Dec 15 2017
11 comments
1,908 views