热门标签 | HotTags
当前位置:  开发笔记 > 数据库 > 正文

关于谓词条件有倾斜性的全表扫描countstopkey的成本估算影响

监控系统发现一个sql语句的相应时间特别长,消耗的IO资源也非常高,但是cost成本却非常低SQLSELECTA.ATTRVALUE,B.TYPENAME2FROMtbcs.GROUP_SUBS_MEMBER_ATTRA,tbcs.RESOURCE_TYPEB3WHEREA.ATTRIDres_type4ANDA.REGION235ANDB.ITE

监控系统发现一个sql语句的相应时间特别长,消耗的IO资源也非常高,但是cost成本却非常低 SQL SELECT A.ATTRVALUE, B.TYPENAME 2 FROM tbcs.GROUP_SUBS_MEMBER_ATTR A, tbcs.RESOURCE_TYPE B 3 WHERE A.ATTRID = 'res_type' 4 AND A.REGION = 23 5 AND B.ITE

监控系统发现一个sql语句的相应时间特别长,消耗的IO资源也非常高,但是cost成本却非常低

SQL> SELECT A.ATTRVALUE, B.TYPENAME
2 FROM tbcs.GROUP_SUBS_MEMBER_ATTR A, tbcs.RESOURCE_TYPE B
3 WHERE A.ATTRID = 'res_type'
4 AND A.REGION = 23
5 AND B.ITEMID = A.ATTRVALUE
6 AND ROWNUM = 1;

Elapsed: 00:03:18.62

Execution Plan
----------------------------------------------------------
Plan hash value: 1650466411

-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 55 | 7 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| IM_RES_TYPE | 1 | 36 | 1 (0)| 00:00:01 | | |
| 3 | NESTED LOOPS | | 4 | 165 | 7 (0)| 00:00:01 | | |
| 4 | PARTITION RANGE SINGLE | | 4 | 76 | 3 (0)| 00:00:01 | 4 | 4 |
|* 5 | TABLE ACCESS FULL | GROUP_SUBS_MEMBER_ATTR | 4 | 76 | 3 (0)| 00:00:01 | 4 | 4 |
|* 6 | INDEX UNIQUE SCAN | PK_IMRESTYPE | 1 | | 0 (0)| 00:00:01 | | |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM=1)
5 - filter("A"."ATTRID"='res_type' AND "A"."ATTRVALUE" IS NOT NULL AND "A"."REGION"=23)
6 - access("RES_TYPE_ID"="A"."ATTRVALUE")

Statistics
----------------------------------------------------------
8 recursive calls
0 db block gets
469240 consistent gets
469186 physical reads
0 redo size
619 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

那么问题是为什么cost的成本这么低,但是逻辑读确如此之高。

首先这里简单的分析下这个执行计划为什么会有如此高的逻辑读,先来看下面的查询:

SQL> select count(*)
2 from tbcs.GROUP_SUBS_MEMBER_ATTR A
3 where A.ATTRID = 'res_type'
4 AND A.REGION = 23
5 and "A"."ATTRVALUE" IS NOT NULL
6 ;

COUNT(*)
----------
14

而且这部分数据都在rownum 7千万以上的位置。

SQL> select cn from (select a.*, rownum cn from tbcs.GROUP_SUBS_MEMBER_ATTR A where region=23)b where b. ATTRID = 'res_type' and ATTRVALUE IS NOT NULL

CN
----------
72251394
72253121
72261116
72287094
72292151
72296915
72296922
72304758
72333694
72334266
72334281
72334924
72336096
72336103

14 rows selected.

虽然sql语句加上了rownum=1的限制,但是由于tbcs.GROUP_SUBS_MEMBER_ATTR A中(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)的满足这个条件的第一行数据在分区段的较后面的block中(全表扫描也是有顺序的)。 此时这个单分区全表扫描 count stopeky的时候也要遍历这个分区的大部分block才能找到满足(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)的第一行数据,然后将这行数据同时与IM_RES_TYPE表做nested loop,当发现这行数据满足”RES_TYPE_ID”=”A”.”ATTRVALUE”即终止查询。 如果不满足再去单分区全扫描找第二行满足谓词条件5的数据行,然后再去和IM_RES_TYPE表比对,一直到找到符合条件的数据行为止。

如果全表扫描前几次IO能够扫描的block刚好能够满足谓词条件,则加上rownum限制条件确实是能够减少大部分的IO消耗。

sys@CRMDB1>select attrid from tbcs.GROUP_SUBS_MEMBER_ATTR A where region=23 and ATTRVALUE IS NOT NULL rownum=1;

ATTRID
--------------------------------
Flag1

1 row selected.

sys@CRMDB1>select * from tbcs.GROUP_SUBS_MEMBER_ATTR A where region=23 and attrid='Flag1' and ATTRVALUE IS NOT NULL and rownum=1;

1 row selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 484799315

------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 41 | 2 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE SINGLE| | 2 | 82 | 2 (0)| 00:00:01 | 4 | 4 |
|* 3 | TABLE ACCESS FULL | GROUP_SUBS_MEMBER_ATTR | 2 | 82 | 2 (0)| 00:00:01 | 4 | 4 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM=1)
3 - filter("ATTRID"='Flag1' AND "A"."ATTRVALUE" IS NOT NULL AND "REGION"=23)

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
968 bytes sent via SQL*Net to client
521 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

由于满足谓词5的数据都在分区段的后面的block中,所以这个sql会消耗特别多的IO资源。

下面我们来解释下为什么在单分区全表扫描然后count stopkey时cbo的评估的cost成本只有3,先来看下表的统计信息:

关于GROUP_SUBS_MEMBER_ATTR表的统计信息:

Table????????????????????????????????? Number??????????????????????? Empty??? Chain Average Global???????? Sample Date
Name????????????????????????????????? of Rows????????? Blocks?????? Blocks??? Count Row Len Stats??????????? Size MM-DD-YYYY
------------------------------ -------------- --------------- ------------ -------- ------- ------ -------------- ----------
GROUP_SUBS_MEMBER_ATTR??????????? 344,752,080?????? 22,395,19??????????? 0??????? 0????? 41 YES??????? 17,237,604 09-10-2014


Column???????????????????????????? Distinct????????????? Number?????? Number???????? Sample Date
Name???????????????????????????????? Values???? Density Buckets??????? Nulls?????????? Size MM-DD-YYYY
------------------------------ ------------ ----------- ------- ------------ -------------- ----------
REGION??????????????????????????????????? 4?? .25000000?????? 1??????????? 0???? 17,237,604 09-10-2014
ATTRID????????????????????????????????? 166?? .00602410?????? 1??????????? 0???? 17,237,604 09-10-2014
ATTRVALUE?????????????????????????? 189,682?? .00000527?????? 1?? 60,824,860???? 14,196,361 09-10-2014


Index????????????????????????????????????? Leaf?????? Distinct???????? Number????? AV????? Av????? Cluster Date
Name?????????????????????????? BLV???????? Blks?????????? Keys??????? of Rows???? LEA??? Data?????? Factor MM-DD-YYYY
------------------------------ --- ------------ -------------- -------------- ------- ------- ------------ ----------
PK_CM_GROUP_MEMBERATTR?????????? 3??? 3,462,820??? 341,230,660??? 341,230,660?????? 1?????? 1? 108,506,400 09-10-2014

index????????????????????????? Column????????????????????????? Col Column
Name?????????????????????????? Name??????????????????????????? Pos Details
------------------------------ ------------------------------ ---- ------------------
PK_CM_GROUP_MEMBERATTR???????? GRPSUBSMEMOID???????????????????? 1 NUMBER(18,0) NOT NULL
?????????????????????????????? ATTRID??????????????????????????? 2 VARCHAR2(32) NOT NULL
?????????????????????????????? REGION??????????????????????????? 3 NUMBER(5,0) NOT NULL

**********************************************************
Partition Level
**********************************************************

Partition????????????? Number??????????????????????? Empty Average??? Chain Average Global Date
Name????????????????? of Rows????????? Blocks?????? Blocks?? Space??? Count Row Len Stats? MM-DD-YYYY
-------------- -------------- --------------- ------------ ------- -------- ------- ------ ----------
P_R_20??????????? 107,562,800??????? 7,002,86??????????? 0?????? 0??????? 0????? 41 YES??? 10-10-2014
P_R_21???????????? 65,051,340??????? 4,220,31??????????? 0?????? 0??????? 0????? 41 YES??? 07-07-2014
P_R_22???????????? 89,764,040??????? 5,816,18??????????? 0?????? 0??????? 0????? 41 YES??? 08-07-2014
P_R_23???????????? 77,962,200??????? 5,069,40??????????? 0?????? 0??????? 0????? 41 YES??? 07-26-2014
P_R_99????????????????????? 0??????????? 0,00??????????? 0?????? 0??????? 0?????? 0 YES??? 09-15-2013

对于attrid的num_distinct是166,num_nulls是0,attrvalue的num_distinct是189682,num_nulls是60824860,而且他们的number buckets都是1,也就是没有直方图,此时cbo要根据(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)谓词条件取出满足条件的数据,由于根据统计信息得出大部分的数据都是满足于这个条件的,cbo会认为只需要全表扫描的前几次IO都能够取出满足条件的数据。(关于选择率和cost成本的计算太过于复杂,不属于本篇blog讨论的范围)

单独分离出来这部分查询cbo估算的执行成本也很低,但是消耗的逻辑读还是特别高。

sys@CRMDB1>select * from tbcs.GROUP_SUBS_MEMBER_ATTR A where rownum=1 and region=23 and ATTRID = 'res_type' AND "A"."ATTRVALUE" IS NOT NULL

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 484799315

------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 41 | 2 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE SINGLE| | 2 | 82 | 2 (0)| 00:00:01 | 4 | 4 |
|* 3 | TABLE ACCESS FULL | GROUP_SUBS_MEMBER_ATTR | 2 | 82 | 2 (0)| 00:00:01 | 4 | 4 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM=1)
3 - filter("ATTRID"='Flag1' AND "A"."ATTRVALUE" IS NOT NULL AND "REGION"=23)


Statistics
----------------------------------------------------------
8 recursive calls
0 db block gets
469266 consistent gets
468242 physical reads
0 redo size
984 bytes sent via SQL*Net to client
521 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

这里由于(“ATTRID”=’Flag1′ AND “A”.”ATTRVALUE” IS NOT NULL AND “REGION”=23)的数据具有较大的倾斜性且没有直方图,导致了cbo全表扫描count stopkey时错误的估算了成本。

下面xiaoyu新建一个表来测试有直方图和没直方图时估算类似rownum=1 and column1=A这类查询的成本区别。

SQL> select owner,count(*) from tab01 group by owner;

OWNER COUNT(*)
------------------------------ ----------
OWBSYS_AUDIT 24
MDSYS 4022
QWE 1
PUBLIC 67990
OUTLN 20
CTXSYS 778
OLAPSYS 1442
FLOWS_FILES 26
OWBSYS 4
SYSTEM 1236
ORACLE_OCM 16
EXFSYS 624
APEX_030200 5122
DBSNMP 114
ORDSYS 5026
ORDPLUGINS 20
SYSMAN 7108
APPQOSSYS 10
XDB 2336
ORDDATA 514
XIAOYU 154
SYS 75434
WMSYS 666

23 rows selected.

SQL> exec dbms_stats.gather_table_stats(ownname=>USER,tabname=>'TAB01',method_opt=>'for all columns size 1');

PL/SQL procedure successfully completed.

SQL> set autotrace traceonly;
SQL> select * from tab01 where rownum=1 and owner='SYS';


Execution Plan
----------------------------------------------------------
Plan hash value: 4034257318

----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 98 | 2 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE ALL| | 2 | 196 | 2 (0)| 00:00:01 | 1 | 40 |
|* 3 | TABLE ACCESS FULL | TAB01 | 2 | 196 | 2 (0)| 00:00:01 | 1 | 40 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM=1)
3 - filter("OWNER"='SYS')


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
6 consistent gets
45 physical reads
0 redo size
1605 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL> select * from tab01 where rownum=1 and owner='XIAOYU';


Execution Plan
----------------------------------------------------------
Plan hash value: 4034257318

----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 98 | 2 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE ALL| | 2 | 196 | 2 (0)| 00:00:01 | 1 | 40 |
|* 3 | TABLE ACCESS FULL | TAB01 | 2 | 196 | 2 (0)| 00:00:01 | 1 | 40 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM=1)
3 - filter("OWNER"='XIAOYU')


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2576 consistent gets
2673 physical reads
0 redo size
1615 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL> select * from tab01 where rownum=1 and owner='IMP';

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 4034257318

----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 98 | 2 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE ALL| | 2 | 196 | 2 (0)| 00:00:01 | 1 | 40 |
|* 3 | TABLE ACCESS FULL | TAB01 | 2 | 196 | 2 (0)| 00:00:01 | 1 | 40 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM=1)
3 - filter("OWNER"='IMP')


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
3359 consistent gets
3263 physical reads
0 redo size
1343 bytes sent via SQL*Net to client
512 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed

这里看出来在num bucket等于1时,全表扫描然后count stop key成本估算都是2,但是由于数据的分布问题实际的逻辑读是有量变的。

SQL> exec dbms_stats.gather_table_stats(ownname=>USER,tabname=>'TAB01',method_opt=>'for all columns size auto');

PL/SQL procedure successfully completed.

SQL> select * from tab01 where rownum=1 and owner='SYS';


Execution Plan
----------------------------------------------------------
Plan hash value: 4034257318

----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 98 | 2 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE ALL| | 2 | 196 | 2 (0)| 00:00:01 | 1 | 40 |
|* 3 | TABLE ACCESS FULL | TAB01 | 2 | 196 | 2 (0)| 00:00:01 | 1 | 40 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM=1)
3 - filter("OWNER"='SYS')


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6 consistent gets
45 physical reads
0 redo size
1605 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL> select * from tab01 where rownum=1 and owner='XIAOYU';


Execution Plan
----------------------------------------------------------
Plan hash value: 4034257318

----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 98 | 25 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE ALL| | 2 | 196 | 25 (0)| 00:00:01 | 1 | 40 |
|* 3 | TABLE ACCESS FULL | TAB01 | 2 | 196 | 25 (0)| 00:00:01 | 1 | 40 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM=1)
3 - filter("OWNER"='XIAOYU')


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2576 consistent gets
2673 physical reads
0 redo size
1615 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL> select * from tab01 where rownum=1 and owner='IMP';

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 4034257318

----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 98 | 161 (0)| 00:00:02 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE ALL| | 2 | 196 | 161 (0)| 00:00:02 | 1 | 40 |
|* 3 | TABLE ACCESS FULL | TAB01 | 2 | 196 | 161 (0)| 00:00:02 | 1 | 40 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM=1)
3 - filter("OWNER"='IMP')


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3359 consistent gets
3263 physical reads
0 redo size
1343 bytes sent via SQL*Net to client
512 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed

如果收集该列的直方图,此时优化器清楚的知道谓词条件的数据分布,对于这类查询优化器就能够较准确的评估cost的成本。

优化这个sql并不难,有一个思路就是利用小表IM_RES_TYPE去做驱动表驱动大表GROUP_SUBS_MEMBER_ATTR,可以利用attrvalue和attrid的等值条件创建索引来避免对大表的单分区全扫描。

直接创建(ATTRVALUE+ATTRID)的索引。

explain plan for
SELECT A.ATTRVALUE, B.TYPENAME
FROM tbcs.GROUP_SUBS_MEMBER_ATTR A, tbcs.RESOURCE_TYPE B
WHERE A.ATTRID = 'res_type'
AND A.REGION = 23
AND B.ITEMID = A.ATTRVALUE
AND ROWNUM = 1;

Plan hash value: 1650466411

-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 55 | 7 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| IM_RES_TYPE | 1 | 36 | 1 (0)| 00:00:01 | | |
| 3 | NESTED LOOPS | | 4 | 165 | 7 (0)| 00:00:01 | | |
| 4 | PARTITION RANGE SINGLE | | 4 | 76 | 3 (0)| 00:00:01 | 4 | 4 |
|* 5 | TABLE ACCESS FULL | GROUP_SUBS_MEMBER_ATTR | 4 | 76 | 3 (0)| 00:00:01 | 4 | 4 |
|* 6 | INDEX UNIQUE SCAN | PK_IMRESTYPE | 1 | | 0 (0)| 00:00:01 | | |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM=1)
5 - filter("A"."ATTRID"='res_type' AND "A"."ATTRVALUE" IS NOT NULL AND "A"."REGION"=23)
6 - access("RES_TYPE_ID"="A"."ATTRVALUE")

这里默认情况下cbo根据cost成本估算并不会选择先用tbcs.RESOURCE_TYPE B去做驱动表做nested loop关联,这是因为两个sql的执行成本太接近,优化器既有可能选择全表扫描count stopkey,也有可能选择索引扫描count stopkey,但是消耗的IO资源是存在量变的。

SQL> SELECT /*+leading(B A)*/A.ATTRVALUE, B.TYPENAME
2 FROM tbcs.GROUP_SUBS_MEMBER_ATTR A, tbcs.RESOURCE_TYPE B
3 WHERE A.ATTRID = 'res_type'
4 AND A.REGION = 23
5 AND B.ITEMID = A.ATTRVALUE
6 AND ROWNUM = 1;

Elapsed: 00:00:00.06

Execution Plan
----------------------------------------------------------
Plan hash value: 458037665

-----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 55 | 14 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
|* 2 | TABLE ACCESS BY LOCAL INDEX ROWID| GROUP_SUBS_MEMBER_ATTR | 1 | 19 | 12 (0)| 00:00:01 | 4 | 4 |
| 3 | NESTED LOOPS | | 10 | 55 | 14 (0)| 00:00:01 | | |
| 4 | TABLE ACCESS FULL | IM_RES_TYPE | 1 | 36 | 2 (0)| 00:00:01 | | |
| 5 | PARTITION RANGE SINGLE | | 10 | | 3 (0)| 00:00:01 | 4 | 4 |
|* 6 | INDEX RANGE SCAN | IND_ATTRVALUE_ATTRID | 10 | | 3 (0)| 00:00:01 | 4 | 4 |
-----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM=1)
2 - filter("A"."REGION"=23)
6 - access("RES_TYPE_ID"="A"."ATTRVALUE" AND "A"."ATTRID"='res_type')
filter("A"."ATTRVALUE" IS NOT NULL)


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
370 consistent gets
0 physical reads
0 redo size
619 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

这里需要强制加上leading(B A)让优化器先以tbcs.RESOURCE_TYPE B去做驱动表,然后驱动GROUP_SUBS_MEMBER_ATTR A表。

我们这里还可以有个想法就是,既然原sql消耗的成本主要在全表扫描取(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)的数据,那么我们能否用另外的一个取数据的办法就是通过index range scan的方式了,那我们能否让cbo走(ATTRVALUE+ATTRID)的索引来取数据。

SQL> SELECT /*+index(A ind_attrvalue_attrid)*/A.ATTRVALUE, B.TYPENAME
2 FROM tbcs.GROUP_SUBS_MEMBER_ATTR A, tbcs.RESOURCE_TYPE B
3 WHERE A.ATTRID = 'res_type'
4 AND A.REGION = 23
5 AND B.ITEMID = A.ATTRVALUE
6 AND ROWNUM = 1;

Elapsed: 00:02:21.41

Execution Plan
----------------------------------------------------------
Plan hash value: 485372855

--------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 55 | 11 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID | IM_RES_TYPE | 1 | 36 | 1 (0)| 00:00:01 | | |
| 3 | NESTED LOOPS | | 4 | 165 | 11 (0)| 00:00:01 | | |
| 4 | PARTITION RANGE SINGLE | | 4 | 76 | 7 (0)| 00:00:01 | 4 | 4 |
|* 5 | TABLE ACCESS BY LOCAL INDEX ROWID| GROUP_SUBS_MEMBER_ATTR | 4 | 76 | 7 (0)| 00:00:01 | 4 | 4 |
|* 6 | INDEX FULL SCAN | IND_ATTRVALUE_ATTRID | 98189 | | 6 (0)| 00:00:01 | 4 | 4 |
|* 7 | INDEX UNIQUE SCAN | PK_IMRESTYPE | 1 | | 0 (0)| 00:00:01 | | |
--------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM=1)
5 - filter("A"."REGION"=23)
6 - access("A"."ATTRID"='res_type')
filter("A"."ATTRID"='res_type' AND "A"."ATTRVALUE" IS NOT NULL)
7 - access("RES_TYPE_ID"="A"."ATTRVALUE")


Statistics
----------------------------------------------------------
8 recursive calls
0 db block gets
227934 consistent gets
227981 physical reads
0 redo size
619 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

cob采取比较糟糕的index full scan然会回表的方式,并没有采取我们所认为的index range scan的方式来回表,那我们想想为什么优化器不能够采用index range scan的方式了,这里是因为我们创建索引是(ATTRVALUE+ATTRID)的组合索引,在这个索引取数据时由于”A”.”ATTRVALUE” IS NOT NULL并不是一个等值的条件,而这个attrvalue又是前导列,这个导致优化器如果想走index range scan需要走类似的index range scan然后 INLIST ITERATOR迭代的方式,由于attrvalue有很大一部分的null value,这将会导致这部分执行cbo估算时较高,而不选择这种执行计划,进而选择了更糟糕的index full scan回表的方式,由于attrid=’res_type’ 具有很大的倾斜性,刚好这部分数据又在索引的后面的几个leaf block中。

再来想想(ATTRID+ATTRVALUE)复合索引,通过attrid是索引前导列,(“A”.”ATTRID”=’res_type’ AND “A”.”ATTRVALUE” IS NOT NULL AND “A”.”REGION”=23)这个谓词条件利用attrid是索引前导列的复合索引可以很精准的从root到branch再到leaf block,从而在leaf block时index range scan。

SQL> SELECT A.ATTRVALUE, B.TYPENAME
2 FROM tbcs.GROUP_SUBS_MEMBER_ATTR A, tbcs.RESOURCE_TYPE B
3 WHERE A.ATTRID = 'res_type'
4 AND A.REGION = 23
5 AND B.ITEMID = A.ATTRVALUE
6 AND ROWNUM = 1;

Elapsed: 00:00:00.12

Execution Plan
----------------------------------------------------------
Plan hash value: 2801988880

--------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 55 | 9 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID | IM_RES_TYPE | 1 | 36 | 1 (0)| 00:00:01 | | |
| 3 | NESTED LOOPS | | 4 | 165 | 9 (0)| 00:00:01 | | |
| 4 | PARTITION RANGE SINGLE | | 4 | 76 | 5 (0)| 00:00:01 | 4 | 4 |
|* 5 | TABLE ACCESS BY LOCAL INDEX ROWID| GROUP_SUBS_MEMBER_ATTR | 4 | 76 | 5 (0)| 00:00:01 | 4 | 4 |
|* 6 | INDEX RANGE SCAN | IND_ATTRID_ATTRVALUE | 98189 | | 4 (0)| 00:00:01 | 4 | 4 |
|* 7 | INDEX UNIQUE SCAN | PK_IMRESTYPE | 1 | | 0 (0)| 00:00:01 | | |
--------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM=1)
5 - filter("A"."REGION"=23)
6 - access("A"."ATTRID"='res_type')
filter("A"."ATTRVALUE" IS NOT NULL)
7 - access("RES_TYPE_ID"="A"."ATTRVALUE")


Statistics
----------------------------------------------------------
8 recursive calls
0 db block gets
10 consistent gets
5 physical reads
0 redo size
619 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

这里我们需要注意的是在表中添加了rownum的限制又添加了部分谓词的条件时,刚好满足这个谓词条件对应的数据块在表段较后面的数据块中,且对应的列没有直方图又存在数据倾斜,此时cbo估算这个全表扫描 count stopkey成本往往是不准确的,当然上面这个例子确实比较特殊,没有直方图且数据有倾斜性,这里借助这个sql case分析只是为了让我们更好的理解nested loop的原理、以及rownum对于表扫描索引扫描时执行计划和成本估算的影响。

推荐阅读
  • 基于PgpoolII的PostgreSQL集群安装与配置教程
    本文介绍了基于PgpoolII的PostgreSQL集群的安装与配置教程。Pgpool-II是一个位于PostgreSQL服务器和PostgreSQL数据库客户端之间的中间件,提供了连接池、复制、负载均衡、缓存、看门狗、限制链接等功能,可以用于搭建高可用的PostgreSQL集群。文章详细介绍了通过yum安装Pgpool-II的步骤,并提供了相关的官方参考地址。 ... [详细]
  • 一、Hadoop来历Hadoop的思想来源于Google在做搜索引擎的时候出现一个很大的问题就是这么多网页我如何才能以最快的速度来搜索到,由于这个问题Google发明 ... [详细]
  • Oracle Database 10g许可授予信息及高级功能详解
    本文介绍了Oracle Database 10g许可授予信息及其中的高级功能,包括数据库优化数据包、SQL访问指导、SQL优化指导、SQL优化集和重组对象。同时提供了详细说明,指导用户在Oracle Database 10g中如何使用这些功能。 ... [详细]
  • 本文介绍了adg架构设置在企业数据治理中的应用。随着信息技术的发展,企业IT系统的快速发展使得数据成为企业业务增长的新动力,但同时也带来了数据冗余、数据难发现、效率低下、资源消耗等问题。本文讨论了企业面临的几类尖锐问题,并提出了解决方案,包括确保库表结构与系统测试版本一致、避免数据冗余、快速定位问题等。此外,本文还探讨了adg架构在大版本升级、上云服务和微服务治理方面的应用。通过本文的介绍,读者可以了解到adg架构设置的重要性及其在企业数据治理中的应用。 ... [详细]
  • 在说Hibernate映射前,我们先来了解下对象关系映射ORM。ORM的实现思想就是将关系数据库中表的数据映射成对象,以对象的形式展现。这样开发人员就可以把对数据库的操作转化为对 ... [详细]
  • 本文介绍了使用postman进行接口测试的方法,以测试用户管理模块为例。首先需要下载并安装postman,然后创建基本的请求并填写用户名密码进行登录测试。接下来可以进行用户查询和新增的测试。在新增时,可以进行异常测试,包括用户名超长和输入特殊字符的情况。通过测试发现后台没有对参数长度和特殊字符进行检查和过滤。 ... [详细]
  • 本文详细介绍了MysqlDump和mysqldump进行全库备份的相关知识,包括备份命令的使用方法、my.cnf配置文件的设置、binlog日志的位置指定、增量恢复的方式以及适用于innodb引擎和myisam引擎的备份方法。对于需要进行数据库备份的用户来说,本文提供了一些有价值的参考内容。 ... [详细]
  • 使用Ubuntu中的Python获取浏览器历史记录原文: ... [详细]
  • 本文由编程笔记小编整理,介绍了PHP中的MySQL函数库及其常用函数,包括mysql_connect、mysql_error、mysql_select_db、mysql_query、mysql_affected_row、mysql_close等。希望对读者有一定的参考价值。 ... [详细]
  • 本文介绍了Oracle数据库中tnsnames.ora文件的作用和配置方法。tnsnames.ora文件在数据库启动过程中会被读取,用于解析LOCAL_LISTENER,并且与侦听无关。文章还提供了配置LOCAL_LISTENER和1522端口的示例,并展示了listener.ora文件的内容。 ... [详细]
  • Spring特性实现接口多类的动态调用详解
    本文详细介绍了如何使用Spring特性实现接口多类的动态调用。通过对Spring IoC容器的基础类BeanFactory和ApplicationContext的介绍,以及getBeansOfType方法的应用,解决了在实际工作中遇到的接口及多个实现类的问题。同时,文章还提到了SPI使用的不便之处,并介绍了借助ApplicationContext实现需求的方法。阅读本文,你将了解到Spring特性的实现原理和实际应用方式。 ... [详细]
  • Java String与StringBuffer的区别及其应用场景
    本文主要介绍了Java中String和StringBuffer的区别,String是不可变的,而StringBuffer是可变的。StringBuffer在进行字符串处理时不生成新的对象,内存使用上要优于String类。因此,在需要频繁对字符串进行修改的情况下,使用StringBuffer更加适合。同时,文章还介绍了String和StringBuffer的应用场景。 ... [详细]
  • Oracle分析函数first_value()和last_value()的用法及原理
    本文介绍了Oracle分析函数first_value()和last_value()的用法和原理,以及在查询销售记录日期和部门中的应用。通过示例和解释,详细说明了first_value()和last_value()的功能和不同之处。同时,对于last_value()的结果出现不一样的情况进行了解释,并提供了理解last_value()默认统计范围的方法。该文对于使用Oracle分析函数的开发人员和数据库管理员具有参考价值。 ... [详细]
  • MyBatis错题分析解析及注意事项
    本文对MyBatis的错题进行了分析和解析,同时介绍了使用MyBatis时需要注意的一些事项,如resultMap的使用、SqlSession和SqlSessionFactory的获取方式、动态SQL中的else元素和when元素的使用、resource属性和url属性的配置方式、typeAliases的使用方法等。同时还指出了在属性名与查询字段名不一致时需要使用resultMap进行结果映射,而不能使用resultType。 ... [详细]
  • 本文详细介绍了在ASP.NET中获取插入记录的ID的几种方法,包括使用SCOPE_IDENTITY()和IDENT_CURRENT()函数,以及通过ExecuteReader方法执行SQL语句获取ID的步骤。同时,还提供了使用这些方法的示例代码和注意事项。对于需要获取表中最后一个插入操作所产生的ID或马上使用刚插入的新记录ID的开发者来说,本文提供了一些有用的技巧和建议。 ... [详细]
author-avatar
早安丶晚安丶英语连_691
这个家伙很懒,什么也没留下!
PHP1.CN | 中国最专业的PHP中文社区 | DevBox开发工具箱 | json解析格式化 |PHP资讯 | PHP教程 | 数据库技术 | 服务器技术 | 前端开发技术 | PHP框架 | 开发工具 | 在线工具
Copyright © 1998 - 2020 PHP1.CN. All Rights Reserved | 京公网安备 11010802041100号 | 京ICP备19059560号-4 | PHP1.CN 第一PHP社区 版权所有