Thursday 17 December 2009

Miracle Open World 2010, April 14-16, Legoland

Finally Miracle has published the first official advertisement for Miracle Open World 2010 event, which will take place in Legoland, Billund, Jutland, Denmark, from 14-16 April 2010. Here is the link to the event site.

I'll be presenting at this spectacular and unique event. Hopefully the weather will permit me to fly to Legoland by myself and land at Billund airport which is just several hundred meters away from the conference venue.

Wednesday 16 December 2009

Force Cursor Invalidation

Many times it occurs that an inappropriate execution plan is used which was produced by using the current values of bind variables provided at the time of the hard parse. But later on the variables change so much that another execution plan would be required. Unfortunately there is no automatism in 9i and 10g that would spot this fact. Oracle finally resolved this problem in 11g.

The trick is to virtually set the statistics for the object which is involved in the query. What I mean by virtually is that I read the current statistics and store the same statistics back what makes no harm but the side effect is that the cursor is invalidated and hence it will be re-parsed and hopefully this time optimized for the right values of bind variables.

Here is the code:

CREATE OR REPLACE PROCEDURE Invalidate_statistics (
p_ownname VARCHAR2,
p_tabname VARCHAR2
) IS
m_srec DBMS_STATS.STATREC;
m_distcnt NUMBER;
m_density NUMBER;
m_nullcnt NUMBER;
m_avgclen NUMBER;
m_colname VARCHAR2(30);
--
-- This procedure invalidates all cursors present in v$sql
-- which refer to the table passed as input parameter
--
-- requires DBA privileges
--
-- This is a workaround to solve the problem of bind variable peeking
--
BEGIN
-- first find column definition for the first column - doesn't matter for which
-- column we reset the statistics
BEGIN
SELECT column_name INTO m_colname
FROM DBA_TAB_COLUMNS
WHERE owner = p_ownname
AND table_name = p_tabname
AND ROWNUM = 1;
EXCEPTION
WHEN NO_DATA_FOUND THEN RAISE_APPLICATION_ERROR( -20000,
'Table '||p_ownname||'.'||p_tabname||' does not exist');
END;
-- virtually change the statistics - replace with the same values what will cause
-- invalidation of all cursors which refer to this table - NO_INVALIDATE must be FALSE
DBMS_STATS.GET_COLUMN_STATS (ownname => p_ownname,
tabname => p_tabname,
colname => m_colname,
distcnt => m_distcnt,
density => m_density,
nullcnt => m_nullcnt,
srec => m_srec,
avgclen => m_avgclen
);
DBMS_STATS.SET_COLUMN_STATS (ownname => p_ownname,
tabname => p_tabname,
colname => m_colname,
distcnt => m_distcnt,
density => m_density,
nullcnt => m_nullcnt,
srec => m_srec,
avgclen => m_avgclen,
no_invalidate => FALSE
);
END;
/

The procedure reads and stores back the statistics for the first column of the table.

SQL> exec Invalidate_statistics('SCOTT','EMP')

PL/SQL procedure successfully completed.

Sometimes it requires to re-execute the invalidation as the cursor is currently executing. One can use this in 9i,10g,11g.

Thursday 10 December 2009

"Optimizer Internals" - Celebrity Seminar in Netherlands

On June 15th 2010 I will be speaking about internals of the Cost Based Optimizer in Netherlands. This is a one day seminar organized by Oracle University which goes in details in some sections of the cost based optimization. The topics are related to hot themes for which people would like to get a good explanation. Here is a detailed table of contents.

This seminar is one of four that I have contributed to the series of celebrity seminars organized by Oracle University.

Tuesday 8 December 2009

CBO oddities in determining selectivity on NVARCHAR2 data type

Some time ago I was involved in solving of a performance problem which was really weird - a simple select on a indexed column with unique value always resulted in a full table scan. The key column was of type NVARCHAR2. To better understand the behavior I created a simple reproducible test case.

Let us start with table and index creation.

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
PL/SQL Release 11.1.0.6.0 - Production
CORE 11.1.0.6.0 Production
TNS for 32-bit Windows: Version 11.1.0.6.0 - Production
NLSRTL Version 11.1.0.6.0 - Production

SQL> show parameter semant

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
nls_length_semantics string CHAR

SQL> create table x2 (id varchar2(32), idn nvarchar2(32), id16 nvarchar2(16), id20 nvarchar2(20));

Table created.

SQL> insert into x2 select lpad(rownum,32,'0'), lpad(rownum,32,'0'),lpad(rownum,16,'0'),lpad(rownum,20,'0')
from dual connect by level <= 20000; 20000 rows created. SQL> commit;

Commit complete.

SQL> create index x2_idn on x2(idn);

Index created.

SQL> analyze table x2 compute statistics;

Table analyzed.

SQL> explain plan for select * from x2 where idn=U'00000000000000000000000000000009';

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
Plan hash value: 156302112

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20000 | 3281K| 136 (0)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| X2 | 20000 | 3281K| 136 (0)| 00:00:02 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("IDN"=U'00000000000000000000000000000009')


We have 20,000 rows with unique values in IDN column in the table, we have statistics and still the CBO produces an execution plan with FULL TABLE SCAN and with a very strange estimate of 20,000 rows reurned by this query.
Therefore our first question should be: What is wrong with the statistics? A simple query on USER_TAB_COLS shows the following results:

SQL> select column_name,num_distinct, low_value,high_value from user_tab_cols where table_name='X2';

COLUMN_NAME NUM_DISTINCT LOW_VALUE HIGH_VALUE
------------------------------ ------------ ---------------------------------------------------------------- ----------------------------------------------------------------
ID 20000 3030303030303030303030303030303030303030303030303030303030303031 3030303030303030303030303030303030303030303030303030303230303030
IDN 1 0030003000300030003000300030003000300030003000300030003000300030 0030003000300030003000300030003000300030003000300030003000300030
ID16 20000 0030003000300030003000300030003000300030003000300030003000300031 0030003000300030003000300030003000300030003000320030003000300030
ID20 3 0030003000300030003000300030003000300030003000300030003000300030 0030003000300030003000300030003000300030003000300030003000300032

To display raw values I used function developed by Greg Rahn which gives the following result:

SQL> column lo_val format a32
SQL> column hi_val format a32
SQL> select column_name,
2 num_distinct,
3 display_raw(low_value,data_type) lo_val,
4 display_raw(high_value,data_type) hi_val
5 from user_tab_cols
6 where table_name='X2';

COLUMN_NAME NUM_DISTINCT LO_VAL HI_VAL
------------------------------ ------------ -------------------------------- --------------------------------
ID 20000 00000000000000000000000000000001 00000000000000000000000000020000
IDN 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ID16 20000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0
ID20 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2

We can see that the NUM_DISTINCT column has quite strange values. Reason for this is that Oracle analyzes only first 32 bytes (and not characters) of the string. The default character set for NVARCHAR2 type is AL16UTF16 - a 2 byte character set.
The column IDN was defined as NVARCHAR2(32) and was actually completely filled so every row stores 64 bytes in this column. Here is the proof:


SQL> select dump(idn) from x2 where rownum = 1;

DUMP(IDN)
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Typ=1 Len=64: 0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,48,0,56,0,51

Due to considering only first 32 bytes analyze comes back with a strange result that there is only one distinct value, what is of course true, as all distinct digits are beyond first 32 bytes.

Let us continue and use DBMS_STATS.GATHER_TABLE_STATS for analysis.

SQL> begin
2 dbms_stats.gather_table_stats(
3 ownname=> user ,
4 tabname=> 'X2',
5 cascade=> TRUE,
6 estimate_percent=> DBMS_STATS.AUTO_SAMPLE_SIZE,
7 degree=> DBMS_STATS.AUTO_DEGREE,
8 no_invalidate=> FALSE,
9 granularity=> 'ALL',
10 method_opt=> 'FOR ALL COLUMNS SIZE AUTO');
11 end;
12 /

SQL> select column_name,num_distinct,
2 display_raw(low_value,data_type) lo_val,
3 display_raw(high_value,data_type) hi_val
4 from user_tab_cols where table_name='X2'
5 /

COLUMN_NAME NUM_DISTINCT LO_VAL HI_VAL
------------------------------ ------------ -------------------------------- --------------------------------
ID 19820 00000000000000000000000000000001 00000000000000000000000000020000
IDN 20000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ID16 20000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0
ID20 19942 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2

DBMS_STATS produces much more reliable cardinalities but the low/high values are still showing only 0 for IDN column. Here there is no difference between ANALYZE and DBMS_STATS as it can't be. Oracle looks only at first 32 bytes and that is the fact. Therefore the CBO will be not able to properly determine the cardinality for the range predicates. We will look closer this in the continuation of this post.
This fact might become dangerous also for all databases which use multi-byte character set as default character set. Actually Oracle will always look only at first 32 bytes and in case that the column will contain many many multi-byte characters this can become pretty odd. All this is something new and becomes more critical in now days when a lot of databases are migrated from single-byte character set to a multi-byte character sets. For sure migrating to any flavor of UTF-8/AL32UTF8 character set represent such a thread.

Conclusions about statistics for NVARCHAR2 data type
When I realized what is the reason behind for such behavior I suggested to my customer to switch back to rule based optimizer what solved the problem. The other possibility would be to change all NVARCHAR2 columns to VARCHAR2 but the customer accept this.
First, but the most important conclusion would be not to use NVARCHAR2 data type for key columns, but rather VARCHAR2. Or paraphrasing Mogens Norgard: "We don't use NVARCHAR2 data type!" :-)

*Range cardinalities on columns of NVARCHAR2 data type*
Reason for this part of post is my involvement in one thread on OTN where a lady from Austria, my neighbor country, was complaining about the "strange" execution plans and consequently also "un-wanted" performance. In one mail she explained me that they are experiencing performance problems since they have upgraded from Oracle9i to 10g. She also said that at the same time they moved from VARCHAR2 data type to NVARCHAR2 datatype.
My first impression was that this might be the same situation as I have already explained above. But after digging deeply in the problem, I realized that the problem lies in completely mis-estimated selectivity and consequently also the cardinality when NVARCHAR2 data type column is in question.

To be able to study the case I prepared an extremely simple reproducible test case. Let us start with the setup of the test environment:


SQL> create table x1 (cv varchar2(10), cn nvarchar2(10));

Table created.

SQL> insert into x1 select lpad(rownum,8,'0'), lpad(rownum,8,'0')
from dual connect by level <= 20000; 20000 rows created. SQL> commit;

Commit complete.

SQL> select cv,cn from x1 where rownum <= 10; CV CN ---------- ---------- 00000469 00000469 00000470 00000470 00000471 00000471 00000472 00000472 00000473 00000473 00000474 00000474 00000475 00000475 00000476 00000476 00000477 00000477 00000478 00000478 10 rows selected. SQL> begin
2 dbms_stats.gather_table_stats(
3 ownname=> user ,
4 tabname=> 'X1',
5 cascade=> TRUE,
6 estimate_percent=> DBMS_STATS.AUTO_SAMPLE_SIZE,
7 degree=> DBMS_STATS.AUTO_DEGREE,
8 no_invalidate=> FALSE,
9 granularity=> 'ALL',
10 method_opt=> 'FOR ALL COLUMNS SIZE 1');
11* end;
SQL> /

PL/SQL procedure successfully completed.


Let us run our first test SQL. I have used bind variables named :a1, :a2 of type VARCHAR2 and :n1, :n2 of type NVARCHAR2. I have event written a simple anonymous PL/SQL block where I selected the values for bind variables from the same table in order to be sure that the binding is done properly. While I was running the tests I enabled events 10053 and 10046.


SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
SQL_ID 1fj17ram77n5w, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ /* run 20 */ *
from x1
where cn >= to_nchar(:a1)
and cn <= to_nchar(:a2) Plan hash value: 2189453339 ------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------- |* 1 | FILTER | | 1 | | 9999 |00:00:00.08 | 102 | |* 2 | TABLE ACCESS FULL| X1 | 1 | 1 | 9999 |00:00:00.03 | 102 | ------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(SYS_OP_C2C(:A1)<=SYS_OP_C2C(:A2)) 2 - filter(("CN"<=SYS_OP_C2C(:A2) AND "CN">=SYS_OP_C2C(:A1)))


PARSING IN CURSOR #12 len=111 dep=0 uid=88 oct=3 lid=88 tim=254514943459 hv=2792607932 ad='23f7056c' sqlid='1fj17ram77n5w'
select /*+ gather_plan_statistics */ /* run 20 */ *
from x1
where cn >= to_nchar(:a1)
and cn <= to_nchar(:a2) END OF STMT PARSE #12:c=0,e=734,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=254514943452 BINDS #12: Bind#0 oacdty=01 mxl=32(10) mxlc=10 mal=00 scl=00 pre=00 oacflg=03 fl2=1000010 frm=01 csi=170 siz=64 off=0 kxsbbbfp=0fec0abc bln=32 avl=08 flg=05 value="00000000" Bind#1 oacdty=01 mxl=32(10) mxlc=10 mal=00 scl=00 pre=00 oacflg=03 fl2=1000010 frm=01 csi=170 siz=0 off=32 kxsbbbfp=0fec0adc bln=32 avl=08 flg=01 value="00009999"

I got a hard parse (mis=1), optimizer peeked at the values of bind variables and the estimated cardinality was 1 while the actual one was 9999. From 10046 trace we can clearly see the values of bind variables used by optimizer for the optimization. I used VARCHAR2 type of columns as bind variables but used TO_NCHAR function to convert to NVARCHAR2 data type. The optimizer internally rewrote the SQL so that instead of TO_NCHAR used internal function SYS_OP_C2C which performs conversion from one character set to another.


SQL> select SYS_OP_C2C('Š') from dual;

S
-
S


For next run I used NVARCHAR2 type bind variables and here is the result:

SQL_ID d1bn5vyb6dcgn, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ /* run 15 */ *
from x1
where cn >= to_nchar(:n1)
and cn <= to_nchar(:n2) Plan hash value: 2189453339 ------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------- |* 1 | FILTER | | 1 | | 9999 |00:00:00.08 | 759 | |* 2 | TABLE ACCESS FULL| X1 | 1 | 50 | 9999 |00:00:00.04 | 759 | ------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(:N1<=:N2) 2 - filter(("CN">=:N1 AND "CN"<=:N2)) PARSING IN CURSOR #15 len=111 dep=0 uid=88 oct=3 lid=88 tim=254434564390 hv=2523312628 ad='2c8d2200' sqlid='d1bn5vyb6dcgn' select /*+ gather_plan_statistics */ /* run 15 */ * from x1 where cn >= to_nchar(:n1)
and cn <= to_nchar(:n2) END OF STMT PARSE #15:c=0,e=746,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=254434564383 BINDS #15: Bind#0 oacdty=01 mxl=128(40) mxlc=20 mal=00 scl=00 pre=00 oacflg=03 fl2=1000010 frm=02 csi=2000 siz=256 off=0 kxsbbbfp=0febbee0 bln=128 avl=16 flg=05 value="" Bind#1 oacdty=01 mxl=128(40) mxlc=20 mal=00 scl=00 pre=00 oacflg=03 fl2=1000010 frm=02 csi=2000 siz=0 off=128 kxsbbbfp=0febbf60 bln=128 avl=16 flg=01 value=""

Here we got the estimated number of rows 50 what is exactly the default use by the CBO for ranges - selectivity is 0.0025. So the CBO simply decided to throw in the default selectivity. The values of bind variables are empty ("") - CBO just ignored them during the optimization phase but they were of course respected by the runtime engine.


SQL_ID cx88wv0q05tst, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ /* run 14 */ *
from x1
where substr(cn,1,10) >= to_nchar(:a1)
and substr(cn,1,10) <= to_nchar(:a2) Plan hash value: 2189453339 ------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------- |* 1 | FILTER | | 1 | | 9999 |00:00:00.09 | 759 | |* 2 | TABLE ACCESS FULL| X1 | 1 | 50 | 9999 |00:00:00.05 | 759 | ------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(SYS_OP_C2C(:A1)<=SYS_OP_C2C(:A2)) 2 - filter((SUBSTR("CN",1,10)>=SYS_OP_C2C(:A1) AND
SUBSTR("CN",1,10)<=SYS_OP_C2C(:A2))) PARSING IN CURSOR #12 len=137 dep=0 uid=88 oct=3 lid=88 tim=254227180414 hv=2415463600 ad='23f7e900' sqlid='f5wapff7zk35h' select /*+ gather_plan_statistics */ /* run 16 */ * from x1 where substr(cn,1,10) >= to_nchar(:a1)
and substr(cn,1,10) <= to_nchar(:a2) END OF STMT PARSE #12:c=0,e=767,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=254227180407 BINDS #12: Bind#0 oacdty=01 mxl=32(10) mxlc=10 mal=00 scl=00 pre=00 oacflg=03 fl2=1000010 frm=01 csi=170 siz=64 off=0 kxsbbbfp=0ff75ac0 bln=32 avl=08 flg=05 value="00000000" Bind#1 oacdty=01 mxl=32(10) mxlc=10 mal=00 scl=00 pre=00 oacflg=03 fl2=1000010 frm=01 csi=170 siz=0 off=32 kxsbbbfp=0ff75ae0 bln=32 avl=08 flg=01 value="00009999"

In this case I added used SUBSTR function on the column and VARCHAR2 type bind variables and again the CBO threw in the default for range 0.0025, but we can see the values of bind variables.


SQL_ID auyacfyb6bj9r, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ /* run 14 */ *
from x1
where substr(cn,1,10) >= to_nchar(:n1)
and substr(cn,1,10) <= to_nchar(:n2) Plan hash value: 2189453339 ------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------- |* 1 | FILTER | | 1 | | 9999 |00:00:00.09 | 759 | |* 2 | TABLE ACCESS FULL| X1 | 1 | 50 | 9999 |00:00:00.05 | 759 | ------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(:N1<=:N2) 2 - filter((SUBSTR("CN",1,10)>=:N1 AND SUBSTR("CN",1,10)<=:N2)) PARSING IN CURSOR #16 len=137 dep=0 uid=88 oct=3 lid=88 tim=254203929826 hv=1230940210 ad='23f78074' sqlid='13u552p4pxa1k' select /*+ gather_plan_statistics */ /* run 15 */ * from x1 where substr(cn,1,10) >= to_nchar(:n1)
and substr(cn,1,10) <= to_nchar(:n2) END OF STMT PARSE #16:c=0,e=770,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=254203929819 BINDS #16: Bind#0 oacdty=01 mxl=128(40) mxlc=20 mal=00 scl=00 pre=00 oacflg=03 fl2=1000010 frm=02 csi=2000 siz=256 off=0 kxsbbbfp=10460bd0 bln=128 avl=16 flg=05 value="" Bind#1 oacdty=01 mxl=128(40) mxlc=20 mal=00 scl=00 pre=00 oacflg=03 fl2=1000010 frm=02 csi=2000 siz=0 off=128 kxsbbbfp=10460c50 bln=128 avl=16 flg=01 value=""

In the CBO trace the inforamation was:

*******************************************
Peeked values of the binds in SQL statement
*******************************************
----- Bind Info (kkscoacd) -----
Bind#0
oacdty=01 mxl=32(20) mxlc=10 mal=00 scl=00 pre=00
oacflg=03 fl2=1000010 frm=02 csi=2000 siz=64 off=0
kxsbbbfp=0f1f5a1c bln=32 avl=16 flg=05
value=""
Bind#1
oacdty=01 mxl=32(20) mxlc=10 mal=00 scl=00 pre=00
oacflg=03 fl2=1000010 frm=02 csi=2000 siz=0 off=32
kxsbbbfp=0f1f5a3c bln=32 avl=16 flg=01
value=""

In the last case I just changed from VARCHAR2 to NVARCHAR2 bind variables and the result was expected - the default selectivity and the values of bind variables were again "".

Every time, when the CBO used binds for estimating selectivity I could see the following section after the execution plan:

Peeked Binds
============
Bind variable information
position=1
datatype(code)=1
datatype(string)=VARCHAR2(32)
char set id=170
char format=1
max length=32
value=00000000
Bind variable information
position=2
datatype(code)=1
datatype(string)=VARCHAR2(32)
char set id=170
char format=1
max length=32
value=00009999
Outline Data:
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.1.0.6')
DB_VERSION('11.1.0.6')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
FULL(@"SEL$1" "X1"@"SEL$1")
END_OUTLINE_DATA
*/

And finally one case with VARCHAR2 data type.

SQL_ID bznnx6zk4hm11, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ /* run 22 */ *
from x1 where cv >=:a1
and cv <= :a2 Plan hash value: 2189453339 ------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------- |* 1 | FILTER | | 1 | | 9999 |00:00:00.07 | 102 | |* 2 | TABLE ACCESS FULL| X1 | 1 | 9999 | 9999 |00:00:00.02 | 102 | ------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(:A1<=:A2) 2 - filter(("CV"<=:A2 AND "CV">=:A1))


PARSING IN CURSOR #13 len=89 dep=0 uid=88 oct=3 lid=88 tim=265619391820 hv=3829943329 ad='2e0c3568' sqlid='bznnx6zk4hm11'
select /*+ gather_plan_statistics */ /* run 22 */ *
from x1
where cv >= :a1
and cv <= :a2 END OF STMT PARSE #13:c=0,e=762,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=265619391813 BINDS #13: Bind#0 oacdty=01 mxl=32(10) mxlc=10 mal=00 scl=00 pre=00 oacflg=03 fl2=1000010 frm=01 csi=170 siz=64 off=0 kxsbbbfp=0fec0abc bln=32 avl=08 flg=05 value="00000000" Bind#1 oacdty=01 mxl=32(10) mxlc=10 mal=00 scl=00 pre=00 oacflg=03 fl2=1000010 frm=01 csi=170 siz=0 off=32 kxsbbbfp=0fec0adc bln=32 avl=08 flg=01 value="00009999"

In this last case I was querying the VARCHAR2 column and using bind variables of VARCHAR2 type and this time the optimizer estimated cardinality correctly. Of course it performed peeking at the values of bind variables.

Randolf Geist responded to my post on OTN and he suggested to confuse the CBO by using values like 'A0000000' to 'A00009999'. According to his observations also the selectivity for range predicates on VARCHAR2 data types is quite off after using this trick. Building a histogram returns things back to normal on both data types.
He pointed out that the CBO is obviously so smart to spot stored digits in VARCHAR2 columns and do good estimations by treating them as numbers while this is not true for the VARCHAR2 data type.

CONCLUSION
It is obvious that the CBO is not capable to perforem same estimates for ranges on NVARCHAR2 column as it is capable to do on VARCHAR2 column when numbers are stored as strings.
But the most important conclusion to be made is that one should use *appropriate* data types.

Of course that are many possibilities that the CBO might behave differently (like using DYNAMIC_SAMPLING, ...). So I am looking forward for the contributions of others, not just Randolf Geist.

This case was used to raise SR #3-1193039441: CBO and nvarchar2 --> A new Bug: 9220704 has been filed with development.

Monday 7 December 2009

What is the purpose of SYS_OP_C2C internal function

Recently I was involved in one problem with CBO's selectivity estimations on NVARCHAR2 data type column. What I spotted in predicate information was the usage of internal Oracle function SYS_OP_C2C.

Here is an example of the run-time execution plan using bind variables:


SQL_ID 1fj17ram77n5w, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ *
from x1 where cn >= to_nchar(:a1) and cn <= to_nchar(:a2)

Plan hash value: 2189453339

-------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------
|* 1 | FILTER | | 1 | | 9999 |00:00:00.08 | 102 |
|* 2 | TABLE ACCESS FULL| X1 | 1 | 1 | 9999 |00:00:00.03 | 102 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(SYS_OP_C2C(:A1)<=SYS_OP_C2C(:A2))
2 - filter(("CN"<=SYS_OP_C2C(:A2) AND "CN">=SYS_OP_C2C(:A1)))

As you can see the bind variables are converted to national character set using TO_NCHAR function. Column X1.CN is of type NVARCHAR2. It is easy to spot the difference between the SQL statement and the predicate information from the execution plan: to_nchar(:a2) from SQL statement is transformed to SYS_OP_C2C(:A2) in predicate information. The internal Oracle function SYS_OP_C2C performs conversion from one character set to another character set - C(haracterSet)2C(haracterSet). There are situations when one will see this conversion going on without explicit command as in this case what should be a sign that the data types are not the same and implicit conversion is taking place and this might be also a problem from performance perspective as it may disable index usage.

Jonathan Lewis and Charles Hooper, my fellows from the OakTable, had a discussion about this internal function back in 2007.

Saturday 5 December 2009

Measurement Error Trap In Trace File (event 10046)


Some time ago I had an interesting case which I can use to clearly describe how one can be caught in measurement error trap.

But let us start at the beginning with this response time analysis:


Response Time Component Time % Elap AvgEla
---------------------------------------- ----------- ------- ---------
CPU service 3934.97s 48.39% 0.000716
un-accounted for time 1363.01s 16.76%
db file sequential read 1122.00s 13.80% 0.032253
gc buffer busy 451.73s 5.56% 0.011746
log buffer space 451.64s 5.55% 0.123974
buffer busy waits 176.79s 2.17% 0.029579
gc cr block 2-way 156.49s 1.92% 0.003287
gc cr grant 2-way 100.20s 1.23% 0.006541
latch: cache buffers chains 98.92s 1.22% 0.005708
gc current grant 2-way 69.68s 0.86% 0.006728
latch: library cache 30.10s 0.37% 0.010030
row cache lock 28.95s 0.36% 0.018727
gc current block 2-way 26.72s 0.33% 0.003828
gc cr block busy 19.35s 0.24% 0.006802
gc current grant busy 15.30s 0.19% 0.004999
latch: row cache objects 14.28s 0.18% 0.006165
gc cr block 3-way 11.73s 0.14% 0.002952
gc current block 3-way 11.34s 0.14% 0.003440
log file sync 10.71s 0.13% 0.315066
enq: SQ - contention 9.14s 0.11% 0.060911

My first thought was that there is an I/O problem as the average single block I/O took 32 milliseconds.









After digging for a while I have produced the following graphs which represent the I/O timing.











Let's first look at some facts:
- 3 node RAC
- Same storage
- Single block read time for Instance 1 was substantially different from read times for other instances during off hours
- 5 batch jobs during off hours
- The timings for Instance 1 are obviously not correct

I used the fact that I was performing the analysis on a 3 node RAC to check what are the timings on other two nodes and as you can see from the second graph, they were quite different.

Here is now the explanation:

The output from strace showed:


gettimeofday({1159440978, 931945}, NULL) = 0
pread(14, "\6\242\0\0\375\23\0\2+\254.\0\0\0\1\6\0054\0\0\1\0\5\0"..., 8192, 455 057408) = 8192
gettimeofday({1159440978, 944159}, NULL) = 0



Oracle records time just before performing a system call (pread). When the system call completes Oracle again records the current time and the difference reports as a wait time. Unfortunately due to high CPU load the process was for quite a while waiting at the system level to get on CPU and only then was able to read the current time and therefore the reported elapsed time was quite exaggerated.


Conclusion: Waiting in runque for CPU exaggerates all wait times of the process.

It is always a good practice to confirm findings with a different method. In this case I could use also the operating system tools to measure I/O timings. The same situation one can have on any other kind of wait event but there is not always a possibility to check it independently.

Finally, there was not really an I/O problem but the system was quite CPU bound. Of course the average single block I/O time over 5 - 10 ms shows that probably we are experiencing also I/O bottleneck. The customer later on replaced disk storage with a faster one together with the HW used for RAC.

Thursday 3 December 2009

V$SQL_MONITOR and V$SQL_PLAN_MONITOR

In my recent presentation at UKOUG 2009 in Birmingham I also mentioned the new feature of Oracle11gR1 which is a neat solution for monitoring long running SQL statements. It captures statistics about SQL execution every second.

For parallel execution every process involved gets separate entries in V$SQL_MONITOR and V$SQL_PLAN_MONITOR.

It is enabled by default for long running statements if parameter CONTROL_MANAGEMENT_PACK_ACCESS if it is set to “DIAGNOSTIC+TUNING” and STATISTICS_LEVEL=ALL|TYPICAL

It can be enabled at statement level as well with /*+ MONITOR */ hint or disabled with /*+ NO_MONITOR */ hint.

There are some defaults defined which can be altered by setting hidden parameters:
_sqlmon_max_plan - Maximum number of plans entries that can be monitored. Defaults to 20 per CPU
_sqlmon_max_planlines - Number of plan lines beyond which a plan cannot be monitored (default 300)

Just for demonstration purposes I used a simple performance killing statement which runs quite for a while and could be used to test this new feature.



SQL_ID b0zm3w4h1hbff, child number 0
-------------------------------------
select count(*) from obj$,obj$,obj$ --performance killer

Plan hash value: 3679021907

----------------------------------------------------
| Id | Operation | Name | E-Rows |
----------------------------------------------------
| 1 | SORT AGGREGATE | | 1 |
| 2 | MERGE JOIN CARTESIAN | | 341T|
| 3 | MERGE JOIN CARTESIAN | | 4886M|
| 4 | INDEX FAST FULL SCAN | I_OBJ1 | 69901 |
| 5 | BUFFER SORT | | 69901 |
| 6 | INDEX FAST FULL SCAN| I_OBJ1 | 69901 |
| 7 | BUFFER SORT | | 69901 |
| 8 | INDEX FAST FULL SCAN | I_OBJ1 | 69901 |
----------------------------------------------------

SQL> SELECT status, KEY, SID, sql_id, elapsed_time, cpu_time, fetches, buffer_gets,
2 disk_reads
3 FROM v$sql_monitor where sql_id='b0zm3w4h1hbff';

STATUS KEY SID SQL_ID ELAPSED_TIME CPU_TIME FETCHES BUFFER_GETS DISK_READS
--------- ----------- --- ------------- ------------ ---------- ------- ----------- ----------
EXECUTING 21474836481 170 b0zm3w4h1hbff 674281628 624578125 0 0 0

SQL> SELECT plan_line_id, plan_operation || ' ' || plan_options operation,
2 starts, output_rows
3 FROM v$sql_plan_monitor where key=21474836481
4 ORDER BY plan_line_id;

PLAN_LINE_ID OPERATION STARTS OUTPUT_ROWS
------------ --------------------- ---------- -----------
0 SELECT STATEMENT 1 0
1 SORT AGGREGATE 1 0
2 MERGE JOIN CARTESIAN 1 4283731363
3 MERGE JOIN CARTESIAN 1 156731
4 INDEX FAST FULL SCAN 1 3
5 BUFFER SORT 3 156731
6 INDEX FAST FULL SCAN 1 70088
7 BUFFER SORT 156731 4283731363
8 INDEX FAST FULL SCAN 1 70088

The above result I got after running this statement for about 10 minutes. There are of course more statistics available which I have omitted here as they were useless for my demo case which had all data already in buffer cache.

One can use DBMS_SQLTUNE.REPORT_SQL_MONITOR() function to produce a formatted report of monitoring results. Unfortunately the below report is too wide to be cleary seen. I got it for a statement run by DBMS_STATS package


SQL> set long 10000000
SQL> set longchunksize 10000000
SQL> set linesize 200
SQL> select dbms_sqltune.report_sql_monitor from dual;

SQL Monitoring Report

SQL Text
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
select /*+ no_parallel_index(t, "I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST") dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring no_expand
index(t,"I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST") */ count(*) as nrw,count(distinct sys_op_lbid(461,'L',t.rowid)) as nlb,count(distinct
hextoraw(sys_op_descend("OBJ#")||sys_op_descend("INTCOL#")||sys_op_descend(SYS_EXTRACT_UTC("SAVTIME"))||sys_op_descend("COLNAME"))) as ndk,sys_op_countchg(substrb(t.rowid,1,15),1)
as clf from "SYS"."WRI$_OPTSTAT_HISTGRM_HISTORY" t where "OBJ#" is not null or "INTCOL#" is not null or SYS_EXTRACT_UTC("SAVTIME") is not null or "COLNAME" is not null
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Global Information

Status : DONE (ALL ROWS)
Instance ID : 1
Session ID : 127
SQL ID : 1d7qn8b9wam8h
SQL Execution ID : 16777216
Plan Hash Value : 3428983716
Execution Started : 12/04/2009 01:09:20
First Refresh Time : 12/04/2009 01:09:28
Last Refresh Time : 12/04/2009 01:09:30

--------------------------------------------------------------------
| Elapsed | Cpu | IO | Other | Fetch | Buffer | Reads |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Calls | Gets | |
--------------------------------------------------------------------
| 8.65 | 1.94 | 6.58 | 0.13 | 1 | 1327 | 694 |
--------------------------------------------------------------------


SQL Plan Monitoring Details
=========================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Starts | Rows | Memory | Activity | Activity Detail |
| | | | (Estim) | | Active(s) | Active | | (Actual) | (Max) | (percent) | (sample #) |
=========================================================================================================================================================================
| 0 | SELECT STATEMENT | | | 1026 | 1 | +10 | 1 | 1 | 274K | | |
| 1 | SORT GROUP BY | | 1 | | 6 | +5 | 1 | 1 | 274K | 37.50 | Cpu (3) |
| 2 | INDEX FULL SCAN | I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST | 179K | 1026 | 8 | +3 | 1 | 179K | | 62.50 | db file sequential read (5) |
=========================================================================================================================================================================