Thread: Joining many tables


Permlink Replies: 25 - Pages: 2 [ 1 2 | Next ] - Last Post: Jan 30, 2008 7:23 AM Last Post By: Brian Tkatch
user616556

Posts: 1
Registered: 01/24/08
Joining many tables
Posted: Jan 24, 2008 10:56 AM
Click to report abuse...   Click to reply to this thread Reply
In general, if I have a query joining many tables, is it better to try and create one query with all the joins

i.e
select * from table1 t1, table t2, table3 t3, table4 t4, table5 t5
where t1.id = t2.id and t2.id = .......

use sub-queries

select * from (
(select id, * from table1 t1, table2 t2) s1,
(select id, * from table3 t3, table4 t4) s2
where s1.id = s2.id
devmiral

Posts: 2,402
Registered: 11/16/06
Re: Joining many tables
Posted: Jan 24, 2008 11:04 AM   in response to: user616556 in response to: user616556
Click to report abuse...   Click to reply to this thread Reply
1st one with selected column from all diff. table.NO sub query if can use without sub query
Carlovski

Posts: 72
Registered: 09/29/06
Re: Joining many tables
Posted: Jan 28, 2008 7:13 AM   in response to: devmiral in response to: devmiral
Click to report abuse...   Click to reply to this thread Reply
Caveat here though - sometimes creating an in-line view in that fashion can greatly improve the readibility of the code, you can logically group the tables e.g to join the 'orders' data which may be agregrated from several tables, to the customers data which again may come from several tables.
BluShadow

Posts: 12,879
Registered: 09/21/05
Re: Joining many tables
Posted: Jan 28, 2008 7:33 AM   in response to: Carlovski in response to: Carlovski
Click to report abuse...   Click to reply to this thread Reply
Caveat here though - sometimes creating an in-line
view in that fashion can greatly improve the
readibility of the code, you can logically group the
tables e.g to join the 'orders' data which may be
agregrated from several tables, to the customers data
which again may come from several tables.

Or even better, use ANSI SQL syntax then it's very readable as to what the joins are...

select *
from table1 t1 JOIN table2 t2 ON (t1.id = t2.id)
               JOIN table3 t3 ON (t2.t3_id = t3.id)
               JOIN table4 t4 ON (t3.t4_id = t4.id)
               JOIN table5 t4 ON (....
WHERE t2.txt = 'FRED'
etc.


Message was edited by:
BluShadow
Brian Tkatch

Posts: 379
Registered: 11/29/07
Re: Joining many tables
Posted: Jan 28, 2008 9:58 AM   in response to: user616556 in response to: user616556
Click to report abuse...   Click to reply to this thread Reply
I don't understand why you would want to use the second one.
Vadim Tropashko

Posts: 505
Registered: 01/10/01
Re: Joining many tables
Posted: Jan 28, 2008 10:37 AM   in response to: user616556 in response to: user616556
Click to report abuse...   Click to reply to this thread Reply
s1 and s2 are not subqueries, they are inline views. I hope the optimizer would be able to merge them -- the naive execution of the query with cartesian products would be disastrous.

Theoretically, a flat "select...from...where..." query with joins is superior from performance perspective to any other form. This is why optimizer tries different trasformations (such as unnesting suqueries and merging inline views) which convert a query into this desired "select...from...where..." form. The disclaimer "theoretically" doesn't mean that join execution is faster than tuple iteration semantics of nested subquery. It means that there are more join orders to consider. In practice, however, a developer/DBA already knows what join order might have satisfactory performance, so it might make sense to make it explicit for optimizer, say by writing a query in a special way, or even giving a hint.

ANSI SQL syntax is a sugarcoating that really adds nothing to the conventional "select...from...where...". Every expert in the field (Anthony Molinaro, Wolfgang Breitling, Mladen Gogala to name a few) intuitively feels that the old comma separated syntax is superior. Here is some formal basis for it.

When we write

select ... from ... where ...

this is nothing more than Relational Algebra expression

project(select(cartesian_product()))

There is a theory behind a simple fact that an overwhelming majority of queries fit into this simple pattern. The Alice book page 55 describes a normal form for project-select-cartesian_product algebra, which says that under pretty general condition one may rewrite an expression with an arbitrary sequence of projections , selections, and cartesian products into a single project(select(cartesian_product())). This is why select ... from ... where ... proved to be so powerful.

Note that all predicates fit into the same place — the where clause. And there goes ANSI style join syntax which (in naive attempt to imitate Relational algebra) starts differentiating various sorts of predicates. Many people like Wolfgang found this differentiation artificial, but if you ask me, there is no fundamental difference even between selection and join!

William Robertson

Posts: 6,652
Registered: 06/17/98
Re: Joining many tables
Posted: Jan 28, 2008 11:03 AM   in response to: BluShadow in response to: BluShadow
Click to report abuse...   Click to reply to this thread Reply
I'll second that vote for the ANSI sugar-coating. It places join predicates next to the tables and makes it harder to miss one (you have to specify a cross join explicitly).

I would also recommend writing all predicates the same way around, purely for readability (the optimizer doesn't care, of course). Other people's brains seem to be wired differently to mine on this issue though.

You don't need the brackets around the ON clause btw.
Vadim Tropashko

Posts: 505
Registered: 01/10/01
Re: Joining many tables
Posted: Jan 28, 2008 11:20 AM   in response to: William Robertson in response to: William Robertson
Click to report abuse...   Click to reply to this thread Reply
So let's focus on the problem of accidental cartesian product. Before executing the query, the optimizer analyzes the query join graph and, in principle, can give a warning each time when the graph is disjoint. To suppress the warning the user might be required give a hint (which naturally fits into the where clause) that the cartesian product is expected.

Admittedly this enhancement request have almost no chance to get supported by RDBMS (although competitor RDBMS do have warnings!). Still we can enhance front end development tools: What do you think about SQL Developer displaying such a warning dialog box before submitting the cartesian product query to execution?
Brian Tkatch

Posts: 379
Registered: 11/29/07
Re: Joining many tables
Posted: Jan 28, 2008 11:43 AM   in response to: William Robertson in response to: William Robertson
Click to report abuse...   Click to reply to this thread Reply
Once we're voting... :)

i vote against the ANSI syntax. Limitations belong in the WHERE clause, not in the FROM clause. And, it makes it much harder to read overall.

Nothing beats a good WHERE clause.
Sentinel

Posts: 1,167
Registered: 03/13/07
Re: Joining many tables
Posted: Jan 28, 2008 12:47 PM   in response to: Brian Tkatch in response to: Brian Tkatch
Click to report abuse...   Click to reply to this thread Reply
Whether you prefer the ANSI Join syntax or the old style (+) syntax, the ANSI syntax is more expressive and allows for additional join types such as a full outer join that are not available using the (+) operator in the where clause.

Further using the ANSI Join syntax facilitates a clear separation of the join criteria from limiting factors, not to mention the fact that they are more portable.
Vadim Tropashko

Posts: 505
Registered: 01/10/01
Re: Joining many tables
Posted: Jan 28, 2008 12:55 PM   in response to: Sentinel in response to: Sentinel
Click to report abuse...   Click to reply to this thread Reply
Yes, ANSI style outer join is more expressive. Yet, how often do you write outer join queries? I remember writing left outer join once in the entire 2007 year! I never feel a need writing a full outer join query. So, do you think a minuscule benefit of outer join overweight the horror of NATURAL JOIN clause? (This difference between the theory and practice is remarkable: natural join is one of the nicest operations in Relational Algebra).
Sentinel

Posts: 1,167
Registered: 03/13/07
Re: Joining many tables
Posted: Jan 28, 2008 3:29 PM   in response to: Vadim Tropashko in response to: Vadim Tropashko
Click to report abuse...   Click to reply to this thread Reply
I must admit that I never use natural joins my self and stick strictly to inner, left, right, and full joins, and so far I've only ever found full joins to be usefull when reconciling data, but that's the exact operation that drove me to learning the ANSI syntax in the first place. Now that I've gotten used to it (and it did take a while) I find that I really prefer it's clarity, but then that's a personal preference ;)
William Robertson

Posts: 6,652
Registered: 06/17/98
Re: Joining many tables
Posted: Jan 28, 2008 11:51 PM   in response to: Vadim Tropashko in response to: Vadim Tropashko
Click to report abuse...   Click to reply to this thread Reply
I agree with you on the horror of Natural Joins. The idea of joins depending on a naming convention is deeply unsafe, however neat it looks in textbooks where you join on CITY. It would be more "natural" to me if it depended on a foreign key relationship between the two tables.
mst@pfa.dk

Posts: 315
Registered: 09/05/06
Re: Joining many tables
Posted: Jan 29, 2008 1:21 AM   in response to: Brian Tkatch in response to: Brian Tkatch
Click to report abuse...   Click to reply to this thread Reply
I'll vote for the ANSI as well.

Limitations are still in the WHERE clause - join conditions are not limitations, they are join conditions :-)

Even though we are used to the Oracle version, why not go with the flow and make our SQL's more compatible and readable to SQL Server folks, DB2 folks etc? We do not live in a world of our own anymore ....

Mette
BluShadow

Posts: 12,879
Registered: 09/21/05
Re: Joining many tables
Posted: Jan 29, 2008 2:15 AM   in response to: mst@pfa.dk in response to: mst@pfa.dk
Click to report abuse...   Click to reply to this thread Reply
Firstly, yes I agree that ANSI syntax can be seen as sugar coating but...

Brian Tkatch
i vote against the ANSI syntax. Limitations belong in the WHERE clause, not in the FROM clause. And, it makes it much harder to read overall.

Nothing beats a good WHERE clause.


A join is not a limitation.
If you see my example above the join clauses are where the tables are specified and the limitations are within the WHERE clause.
You only need to specify limitations within the join condition when using outer joins.

To me (and this is personal preference) I prefer to keep the joins and the limitations seperate then I can see clearly what I am using to join tables on and what I am using to limit my data on. It also assists in ensuring that you don't miss out join conditions which can happen when writing larger queries

Vadim Tropashko
Yes, ANSI style outer join is more expressive. Yet, how often do you write outer join queries?
Quite often actually. All depends what data people are having to work with.

I never feel a need writing a full outer join query.
True, I rarely do a full outer join.

So, do you think a minuscule benefit of outer join overweight the horror of NATURAL JOIN clause?
Nobody said that because you use ANSI syntax you have to use NATURAL JOIN clauses.
In fact I would be horrified to find anybody relying on natural joins. I agree they are a horror.

We actually had a situation where we needed to extract data from a third party database and, because of the way the data was being stored it was actually necessary to use ANSI to get the data.
The issue was to do with a table requiring being outer joined to two tables e.g.

SQL> select * from a;
 
        ID
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10
 
10 rows selected.
 
SQL> select * from b;
 
        ID
----------
         2
         4
         6
         8
        10
 
SQL> select * from c;
 
        ID        ID2
---------- ----------
         1          1
         2          2
         3          3
         4          4
         5          5
 
SQL> select *
  2  from a,b,c
  3  where a.id = b.id
  4  and a.id = c.id(+)
  5  and b.id = c.id2(+)
  6  ;
and a.id = c.id(+)
         *
ERROR at line 4:
ORA-01417: a table may be outer joined to at most one other table
 
 
SQL> select *
  2  from a JOIN b ON (a.id = b.id)
  3         LEFT OUTER JOIN c ON (a.id = c.id AND b.id = c.id2)
  4  ;
 
        ID         ID         ID        ID2
---------- ---------- ---------- ----------
         2          2          2          2
         4          4          4          4
         6          6
        10         10
         8          8
 
SQL>


P.S.

William Robertson
You don't need the brackets around the ON clause btw.

I know, I just do that from personal preference.

;)
Legend
Guru Guru : 2500 - 1000000 pts
Expert Expert : 1000 - 2499 pts
Pro Pro : 500 - 999 pts
Journeyman Journeyman : 200 - 499 pts
Newbie Newbie : 0 - 199 pts
Oracle ACE Director
Oracle ACE Member
Oracle Employee ACE
Helpful Answer (5 pts)
Correct Answer (10 pts)

Point your RSS reader here for a feed of the latest messages in all forums