Go Back   Rhinocerus > Newsgroup > Newsgroup comp.databases.* > Newsgroup comp.databases.ibm-db2

Reply
 
Thread Tools Display Modes
  #1 (permalink)  
Old 06-20-2012, 05:01 PM
Oliver Schoett
Guest
 
Posts: n/a
Default NLJOIN cost - am I crazy or DB2?

I used to think the cost of an NLJOIN between an outer and an inner
subquery should be at least

card(outer) * cost(inner),

but the optimizer of DB2/NT64 9.7.5 at standard settings surprises me
with the following:

2.60477e+008
NLJOIN
( 2)
350361
42426.5
/----+-----\
3.86338e+006 67.422
TBSCAN IXSCAN
( 3) ( 6)
35725.5 25.7914
15576.4 2

According to my understanding, the resulting cost(2) (= cost of node 2)
should be at least card(3) * cost(6), which is 3.86338e+006 * 25.7914 =
99.6e6; that is, 100 million instead of the measly 350361 it comes up with.

The actual performance of the plan is as terrible as it looks (I aborted
it after several hours) and confirms that 100 million is closer to the
truth than 350k. Better plans would presumably be available, but do not
get chosen as long as this plan is calculated to be so cheap.

I have seen such strange cost estimates before in the context of more
complex queries, but this is the first time I have seen it in a simple
two-table join.

Can anyone explain what is wrong here?

Regards,

Oliver Schoett
Reply With Quote
Alt Today
Advertising
 
and become member of Rhinocerus
Standard Sponsored Links

  #2 (permalink)  
Old 06-20-2012, 08:15 PM
danfan46
Guest
 
Posts: n/a
Default Re: NLJOIN cost - am I crazy or DB2?

On 06/20/2012 07:01 PM, Oliver Schoett wrote:
> I used to think the cost of an NLJOIN between an outer and an inner subquery should be at least
>
> card(outer) * cost(inner),
>
> but the optimizer of DB2/NT64 9.7.5 at standard settings surprises me with the following:
>
> 2.60477e+008
> NLJOIN
> ( 2)
> 350361
> 42426.5
> /----+-----\
> 3.86338e+006 67.422
> TBSCAN IXSCAN
> ( 3) ( 6)
> 35725.5 25.7914
> 15576.4 2
>
> According to my understanding, the resulting cost(2) (= cost of node 2) should be at least card(3) * cost(6), which is 3.86338e+006 * 25.7914 = 99.6e6; that is, 100 million instead of the measly 350361 it comes up with.
>
> The actual performance of the plan is as terrible as it looks (I aborted it after several hours) and confirms that 100 million is closer to the truth than 350k. Better plans would presumably be available, but do not get chosen as long as this plan is
> calculated to be so cheap.
>
> I have seen such strange cost estimates before in the context of more complex queries, but this is the first time I have seen it in a simple two-table join.
>
> Can anyone explain what is wrong here?
>
> Regards,
>
> Oliver Schoett


If the plan estimate does not agree with reality, I'd first check if statistcs are OK.
/dg
Reply With Quote
  #3 (permalink)  
Old 06-21-2012, 11:29 AM
Oliver Schoett
Guest
 
Posts: n/a
Default Re: NLJOIN cost - am I crazy or DB2?

danfan46 wrote:

> If the plan estimate does not agree with reality, I'd first check if
> statistcs are OK.
> /dg


Good point - I had only inspected the statistics visually. So I created
them afresh with

RUNSTATS ON TABLE ...
on all columns
WITH DISTRIBUTION
AND DETAILED INDEXES ALL
ALLOW WRITE ACCESS

on both the tables involved. The access plan has not changed:

2.60477e+008
NLJOIN
( 2)
350361
42426.5
/----+-----\
3.86338e+006 67.422
TBSCAN IXSCAN
( 3) ( 6)
35725.5 25.7914
15576.4 2

Interestingly, adding "for read only optimize for 1 row" at the end of
the select statement changes not the NLJOIN but its cost estimate:

2.60477e+008
NLJOIN
( 2)
757934
67353.3
/-----+------\
3.86338e+006 67.422
TBSCAN IXSCAN
( 3) ( 4)
63038 25.7914
34066 2

The cost change of node(3) is OK, as a different index is used to obtain
the set of join keys (wider, hence requiring more I/O, but not requiring
a sort, hence delivering the first row more quickly). This should
increase the total cost of the NLJOIN accordingly (by about 30K), as the
result set of node(3) is the same as in the original plan (it is the
result of a full index scan as there are no conditions except the join
conditions in the query).

However, the cost estimate of the NLJOIN is now ~760k instead of the
correct ~100 million, so just as wrong, only different. This does not
make any sense to me whatsoever.

Regards,

Oliver Schoett
Reply With Quote
 
Reply

Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are Off
Pingbacks are Off
Refbacks are Off




All times are GMT. The time now is 10:35 PM.


Copyright ©2009

LinkBacks Enabled by vBSEO 3.3.0 RC2 © 2009, Crawlability, Inc.