|
本文从不绑定变量与绑定变量两种情况讨论柱状图的作用。
一、不绑定变量的情况:
大家可以考虑下面的数据:
sql> select owner,count(1) from th group by owner;
owner count(1)
------------------------------ ----------
suk 1
sys 36216
system 1
|
其中,在表的owenr上建立有一个索引。
做完普通分析后,再来执行查询。
sql> analyze table th compute statistics;
table analyzed
sql> select * from th where owner='sys';
已选择36216行。
execution plan
-------------------------------------------
0 select statement optimizer=choose (cost=50 card=12073 bytes=
881329)
1 0 table access (full) of 'th' (cost=50 card=12073 bytes=8813
29)
statistics
-------------------------------------------
0 recursive calls
0 db block gets
2894 consistent gets
0 physical reads
0 redo size
2045535 bytes sent via sql*net to client
27057 bytes received via sql*net from client
2416 sql*net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
36216 rows processed
sql> select * from th where owner='suk';
execution plan
----------------------------------------------------------
0 select statement optimizer=choose (cost=50 card=12073 bytes=
881329)
1 0 table access (full) of 'th' (cost=50 card=12073 bytes=8813
29)
statistics
--------------------------------------
0 recursive calls
0 db block gets
513 consistent gets
0 physical reads
0 redo size
1133 bytes sent via sql*net to client
503 bytes received via sql*net from client
2 sql*net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
|
只需返回一条数据,但做了全表扫描,这是不合理的执行计划。因为,它只是知道owner列有三个不同的值,但oracle不知道每个不同的owner分别有多少记录,oracle默认为这些数据的分布是完全均匀的,所以,当用owner作条件时,oracle会认为会返回总记录的三分之一(从执行计划中的card=12073可以看出来)
对表th生成柱状图后在做同样的查询:
sql> analyze table th compute statistics for
table for all indexes for all indexed columns;
table analyzed
sql> select * from th where owner='sys';
已选择36216行。
execution plan
----------------------------------------------------------
0 select statement optimizer=choose (cost=50 card=36216 bytes=
2643768)
1 0 table access (full) of 'th' (cost=50 card=36216 bytes=2643
768)
statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2894 consistent gets
0 physical reads
0 redo size
2045535 bytes sent via sql*net to client
27057 bytes received via sql*net from client
2416 sql*net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
36216 rows processed
sql> select * from th where owner='suk';
execution plan
----------------------------------------------------------
0 select statement optimizer=choose (cost=2 card=1 bytes=73)
1 0 table access (by index rowid) of 'th' (cost=2 card=1 bytes
=73)
2 1 index (range scan) of 'idx_th' (non-unique) (cost=1 card
=1)
statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
1133 bytes sent via sql*net to client
503 bytes received via sql*net from client
2 sql*net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
可见,生成了柱状图后,oracle会根据数据的实际分布情况
选择合适的执行计划,提高性能。
-------------------------------------
|
二、绑定变量的情况下
下面看看在绑定变量的情况下,执行同样的操作,会发生什么事情
sql> analyze table th compute statistics;
表已分析。
sql> var o varchar2(20)sql> exec :o:='sys'
pl/sql 过程已成功完成。
sql> select * from th where owner=:o;
已选择32192行。
execution plan
-----------------------------------------
0 select statement optimizer=choose (cost=75 card=10731 bytes=
783363)
1 0 table access (full) of 'th' (cost=75 card=10731 bytes=7833
63)
statistics
-----------------------------------------
0 recursive calls
0 db block gets
2886 consistent gets
0 physical reads
0 redo size
1818406 bytes sent via sql*net to client
24109 bytes received via sql*net from client
2148 sql*net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
32192 rows processed
sql> exec :o:='suk'
pl/sql 过程已成功完成。
sql> select * from th where owner=:o;
execution plan
------------------------------------
0 select statement optimizer=choose (cost=75 card=10731 bytes=
783363)
1 0 table access (full) of 'th' (cost=75 card=10731 bytes=7833
63)
statistics
------------------------------------
0 recursive calls
0 db block gets
770 consistent gets
0 physical reads
0 redo size
1151 bytes sent via sql*net to client
503 bytes received via sql*net from client
2 sql*net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
|
从以上测试可以看到,在绑定变量的情况下,如果没有分析柱状图,两个查询都使用了相同的执行计划--全表扫描。这也很容易理解,在第一次解析sql的时候,会根据:o的绑定的值去窥视表数据,因为oracle不知道连接列的数据的具体分布,所以它会以为会返回三分之一的数据,所以选择了全表扫描。在以后的执行同样的sql时会重用该sql,都会使用第一次解析生成的执行计划了。在本例中,由于没有做柱状图,索引第一次执行select * from th where owner=:o时,无论:0是'sys'还是'suk',都会使用全表扫描。那么,我们是否可以得出这样的一个结论:如果对表做了柱状图,那么如果第一次硬解析sql时:o的值是'suk'时,这个sql将会使用索引扫描;如果第一次硬解析时:o的值是'sys'时,sql将会使用全表扫描呢?看如下的测试例子:
sql> alter system flush shared_pool;
系统已更改。
sql> analyze table th delete statistics;
表已分析。
sql> analyze table th compute statistics for table
for all indexes for all indexed columns;
表已分析。
sql> exec :o:='sys'
pl/sql 过程已成功完成。
sql> select * from th where owner=:o;
已选择32192行。
execution plan
---------------------------------------
0 select statement optimizer=choose (cost=75 card=10731 bytes=
890673)
1 0 table access (full) of 'th' (cost=75 card=10731 bytes=8906
73)
statistics
--------------------------------------
271 recursive calls
0 db block gets
2900 consistent gets
0 physical reads
0 redo size
1818406 bytes sent via sql*net to client
24109 bytes received via sql*net from client
2148 sql*net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
32192 rows processed
sql> alter system flush shared_pool;
系统已更改。
sql> exec :o:='suk'
pl/sql 过程已成功完成。
sql> select * from th where owner=:o;
execution plan
----------------------------------------
0 select statement optimizer=choose (cost=75 card=10731 bytes=
890673)
1 0 table access (full) of 'th' (cost=75 card=10731 bytes=8906
73)
statistics
----------------------------------------
529 recursive calls
0 db block gets
51 consistent gets
0 physical reads
0 redo size
1151 bytes sent via sql*net to client
503 bytes received via sql*net from client
2 sql*net roundtrips to/from client
7 sorts (memory)
0 sorts (disk)
1 rows processed
|
从这个结果可以看到,分析了柱状图后,无论:o的值是'sys'还是'suk',第一次执行该sql时,使用的都是全表扫描,这与刚才我们的推论不一致了,如果真是这样的话,使用绑定变量对表做柱状图还有什么意义呢?其实这应该算是oracle的一个bug,在这里autotrace的结果是不对的,我们可以用10046看
第一次执行
select *
from
th where owner=:o
当:o:='sys'时
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ----------
parse 1 0.00 0.01 0 0 0 0
execute 1 0.00 0.00 0 0 0 0
fetch 2148 0.09 0.17 0 2886 0 32192
------- ------ -------- ---------- ----------
total 2150 0.09 0.18 0 2886 0 32192
misses in library cache during parse: 1
optimizer goal: choose
parsing user id: 23
rows row source operation
------- -----------------------------------
32192 table access full th
第一次执行
select *
from
th where owner=:o
当:o:='suk'时
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ----------
parse 1 0.01 0.01 0 31 0 0
execute 1 0.00 0.00 0 0 0 0
fetch 2 0.00 0.00 0 4 0 1
------- ------ -------- ---------- ----------
total 4 0.01 0.01 0 35 0 1
misses in library cache during parse: 1
optimizer goal: choose
parsing user id: 23
rows row source operation
------- ---------------------------------------
1 table access by index rowid th
1 index range scan idx_th (object id 7248)
|
从oracle 9i开始,oracle在对sql第一次硬解析时,会对绑定的变量值进行窥视,从而根据变量值和数据的分布决定sql的执行计划。从以上的例子可以证明这点。
结论:
1、无论是否绑定变量,对数据分布不均的情况下柱状图都是很有效的
2、对数据分布不均匀的情况下,使用绑定变量可能会造成恶果,就算对表做了柱状图也一样
3、使用绑定变量,sql第一次执行决定了以后同样的sql执行的执行计划
4、autotrace的信息不一定准确,必要时要用10046查看需要的信息
|