Introduction
The purpose of the Cost Based Optimizer (CBO) is to determine the most efficient execution plan for your queries. It makes these decisions based on the statistical information it has about your data and by leveraging Oracle database features such as hash joins, parallel query, and Oracle Partitioning. After an upgrade, the CBO is expected to generate the same or a better performing execution plan for most SQL statements. Still it is inevitable that the CBO will generate a sub-optimal plan for some SQL statements in the new release compared to the prior release. This document aims to prepare you to understand the Optimizer behavior in Oracle Database 9i and Oracle Database 10g by introducing the new features and what steps you should take to minimize plan regressions and help you deal any plan changes that do occur with respect to Siebel CRM applications.
This paper is going to divide three sections the first outlines the new features in the Optimizer and statistics areas,
New optimizer and statistics feature in 10g.
Init.ora parameters
Several of the initialization parameters that govern Optimizer behavior have new default values in Oracle Database 10g. There is also one new initialization parameter. Below are the details on what parameters have new default values and details on the new parameter.
Optimizer_ mode
The parameter optimizer_mode has a new default value of ALL_ROWS in Oracle database 10g.This means the Optimizer will no longer operate under RULE mode (RBO) when a table has no statistics. In Oracle database 10g the Optimizer operates under ALL_ROWS mode (CBO) and will use dynamic sampling to get statistics for any tables that do not have statistics and will use CBO. The other possible values are FIRST_ROWS_1, FIRST_ROWS_10, FIRST_ROWS_100, and FIRST_ROWS_1000. The CHOOSE, RULE, and FIRST_ROWS modes have been deprecated.
Optimizer_dynamic_sampling
The parameter optimizer_dynamic_sampling has a new default value of 2 in Oracle Database 10g. This means dynamic sampling will be applied to all unanalyzed tables. It also means that twice the number of blocks will be use to calculate statistics than were used in Oracle database 9i. The default value for dynamic sampling in 9i was 1.
Optimizer_secure_view_merging
A new init.ora parameter called optimizer_secure_view_merging has been introduced in Oracle Database 10g with a default value of TRUE. The setting of this parameter disables some unsafe view merging capabilities that were present in Oracle Database 9i. When a SQL statement that refers to a view is parsed, the view referenced in a query is expanded into a separate query block, which represents the view definition, and therefore the result of the view. The view text is then merged into the original SQL statement and the new combined query is optimized as a whole. View merging is a SQL transformation that is completely internal to the Optimizer and transparent to the end user.
Parameter 9.2 Value 10gR2 Value
Optimizer_mode CHOOSE ALL_ROWS
Optimizer_dynamic_sampling 1 2
Optimizer_secure_view_merging N/A TRUE
1 Summary of init.ora parameter changes between 9i and 10g
Changes to the DBMS_STATS package
In Oracle 8i a new PL/SQL package, called DBMS_STATS was introduced to gather and manage optimizer statistics. DBMS_STATS is Oracle’s preferred method for gathering statistics and replaces the ANALYZE command for collecting statistics. The DBMS_STATS package has been extended in Oracle Database 10g to accommodate new types of statistics and monitoring data that can now be collected. Changes have also been made to the default value for several of the parameters used in the gather statistics procedures. There is also a new automatic statistics gathering job that is enabled by default in 10g.
New subprograms in the DBMS_STATS package
System statistics
In Oracle Database 9i system statistics were introduce to enable the CBO to effectively cost each operation in an execution plan, by using information about the actual system hardware executing the statement, such as CPU speed and IO performance. However, if system statistics were not gathered in 9i the CBO would revert back to the costing model present in Oracle Database 8i.
In Oracle Database 10g the use of systems statistics is enabled by default and system statistics are automatically initialized with heuristic default values; these values do not represent your actual system. When you gather system statistics in Oracle Database 10g they will override these initial values. To gather system statistics you can use DBMS_STATS.GATHER_SYSTEM_STATS during your peak
workload time window. At the beginning of the peak workload window execute the following command:
BEGIN
DBMS_STATS.GATHER_SYSTEM_STATS (‘START’);
END;
/
At the end of the peak workload window execute the following command:
BEGIN
DBMS_STATS.GATHER_SYSTEM_STATS (‘END’);
END;
/
Oracle recommends gathering system statistics during a representative workload, ideally at peak workload time. You only have to gather system statistics once. System statistics are not automatically collected as part of the new statistics gather job (see the automatic statistics gathering job section below for more details).
Statistics on Dictionary Tables
Since the default value for optimizer_mode in Oracle Database 10g forces the use of the CBO, all tables in the database need to have statistics including all of the dictionary tables (tables owned by ‘sys’ and residing in the system tablespace). During the upgrade process Oracle will automatically gathers statistics on the dictionary tables. Appendix C of the Oracle® Database Upgrade Guide provides scripts that collect optimizer statistics for dictionary objects. By running these scripts prior to performing the actual database upgrade, you can decrease the amount of downtime incurred during the database upgrade. Statistics on the dictionary tables will be maintained via the automatic statistics gathering job run during the nightly maintenance window. If you choose to switch off the automatic statistics gathering job for your main application schema consider leaving it on for the dictionary tables. You can do this by changing the value of AUTOSTATS_TARGET to ORACLE instead of AUTO using the procedure DBMS_STATS.SET_PARAM
BEGIN
DBMS_STATS.SET_PARAM(AUTOSTATS_TARGET,'ORACLE');
END;
/
Statistics on Fixed Objects
You will also need to gather statistics on dynamic performance tables (fixed objects) these are the X$ tables on which the V$ view (V$SQL etc.) are built. Fixed objects now need statistics due to the new default value for optimizer_mode. It’s important to gather statistics on the fixed objects as they are often queried to supply information to Statspack and the new Automatic Workload Repository (AWR) in Oracle Database 10g and you need to give the CBO accurate statistics for these objects. You only need to gather fixed objects statistics once for a representative workload and they are not updated by the automatic statistics gathering job. You can collect statistics on fixed objects using DBMS_STATS.GATHER_FIXED_OBJECTS_STATS.
BEGIN
DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;
END;
/
Restoring Statistics
In Oracle Database 10g when you gather statistics using DBMS_STATS, the original statistics are automatically kept as a backup in dictionary tables and can be easily restored by running DBMS_STATS.RESTORE_TABLE_STATS if the newly gathered statistics prove to be sub-optimal.
The example below restores the statistics for the table SALES back to what they were yesterday and automatically invalidates all of the cursors referencing the SALES table. We want to invalidate all of the cursors because we are restoring yesterday’s statistics since today’s statistics gave us an unacceptable plan. The value of the no_invalidate parameter determines if the cursors referencing the table will be invalidated or not.
BEGIN
DBMS_STATS.RESTORE_TABLE_STATS('SH','SALES',SYSTIMESTAMP-1,false, false);
END;
/
Comparing Statistics
When it comes to deploying a new application or application module it is standard practice to test and tune the application in a test environment before it is moved to production. However, even with testing it’s possible that SQL statements in the application will have different execution plans in production then they did on the test system. One of the key reasons an execution plan can differ from one system to another (from test and production) is because the optimizer statistics on each system are different. In Oracle Database 10g Release 2, the DIFF_TABLE_STATS_* (This features available from 10.2.0.4) functions can be used to compare statistics for a table from two different sources. The statistics can be from:
• A user statistics table and current statistics in the dictionary.
• A single user statistics table containing two sets of statistics that can be identified using statids.
• Two different user statistics tables.
• Two points in history.
The function also compares the statistics of the dependent objects (indexes, columns, partitions). The function displays statistics for the object(s) from both sources if the difference between the statistics exceeds a certain threshold (%). The threshold can be specified as an argument to the function; the default value is 10%. The statistics corresponding to the first source will be used as the basis for
computing the differential percentage.
In the example below we compare the current dictionary statistics for the table EMP with the statistics for EMP in the stats table tab1; the SQL statement will generate a report-like output on the screen.
SQL> select report, maxdiffpct from
table (dbms_stats.diff_table_stats_in_stattab(‘SCOTT’,’EMP’,’TAB1’ ));
More examples of how to use the new DIFF_TABLE_STATS_* functions and an example of it’s output can be found in Appendix A.
Parameter 9.2 Value 10gR2 Value
METHOD_OPT FOR ALL COLUMNS SIZE 1 FOR ALL COLUMNS SIZE AUTO
ESTIMATE_PRECENT 100 (Compute) DBMS_STATS.AUTO_SAMPLE_SIZE
GRANULARITY DEFAULT(Table&Partition) AUTO
CASCADE FALSE DBMS_STATS.AUTO_CASCADE
NO_VALIDATE FALSE DBMS_STATS.AUTO_INVALIDATE
2 Default values for parameters used in DBMS_STATS
The METHOD_OPT parameter controls the creation of histograms during statistics collection. With the new default value of FOR ALL COLUMNS SIZE AUTO, Oracle automatically determines which columns require histograms and the number of buckets that will be used based on the column’s usage statistics. A column is a candidate for a histogram if it has been seen in a where clause predicate e.g. an equality, range, LIKE, etc. Oracle will verify whether the column is skewed before creating a histogram, for example a unique column will not have a histogram created on it.
The ESTIMATE_PERCENT parameter determines the percentage of rows used to calculate the statistics. In Oracle Database 9i the default percentage was 100% or all of the rows in the table. However, in Oracle Database 10g statistics are gathered using a sampling method. Oracle automatically determines the appropriate sample size for every table in order to get accurate statistics.
The GRANULARITY parameter dictates at which level statistics will be gathered. The possible levels are table (global), partition, or subpartition. With the new default setting of AUTO Oracle will determine the granularity based on the objects partitioning type.
The CASCADE parameter determines whether or not statistics are gathered for the indexes on a table. In Oracle Database 10g, this parameter is set to DBMS_STATS.AUTO_CASCADE by default, which means Oracle will determine whether index statistics need to be collected or not.
In Oracle Database 9i the NO_INVALIDATE parameter determined if the dependent cursors will be invalidated immediately after statistics are gathered or not. With the new setting of DBMS_STATS.AUTO_INVALIDATE in Oracle database 10g, cursors that have already been parsed will not be invalidated immediately. They will continue to use the plan that was obtained using the original statistics until Oracle decides to invalidate the dependent cursors. The invalidations will happen gradually
over time to ensure there is no performance impact on the shared pool as there could be if all of the dependent cursors were hard parsed all at once.
Changing the default parameter values for DBMS_STATS
To change the default value for any of the parameters used by the DBMS_STATS subprograms, use the DBMS_STATS.SET_PARAM procedure, e.g:
BEGIN
DBMS_STATS.SET_PARAM ('CASCADE','DBMS_STATS.AUTO_CASCADE');
END;
/
Histograms
With the new default setting for the METHOD_OPT parameter Oracle will automatically determine which columns should have histograms created on them when gathering statistics. Oracle bases the decision to create a histogram on internal information recorded about column usage, such as what number and type of WHERE clause predicates (= < > Like etc.) were used for each column. Histograms allow the Optimizer to better estimate the cardinality of a particular column after applying all where clause predicates. A histogram is a series of buckets, where the number of values occurring within a range is tracked in these buckets. The range for each bucket is established during statistics collection. Histograms are potentially useful when:
1. The column is used in an equality predicate or an equi-join predicate AND there are frequency skews in the column data.
2. The column is used in a range or like predicate AND there are either or both frequency skews or range skews in the column.
Information about existing histograms can be viewed in *_TAB_HISTOGRAMS and *_TAB_COL_STATISTICS.
SELECT S.TABLE_NAME, S.COLUMN_NAME, S.HISTOGRAM
FROM USER_TAB_COL_STATISTICS S;
Histograms and Bind Peeking
When optimizing a SQL statement that contains bind variables in the WHERE clause the Optimizer peeks at the values of these bind variables on the first execution (during hard parse). The Optimizer determines the execution plan based on the initial bind values. On subsequent executions of the query, no peeking takes place (no hard parse happens), so the original execution plan will be used by all future executions, even if the values of the bind variables change. The presence of a histogram on the column used in the expression with the bind variable may cause a different execution plan to be generated for the statement depending on the initial value of the bind variable being peeked, so the execution plan could vary depending on the values of the bind variables on its first invocation. This issue may surface in light of the change in the default behavior in DBMS_STATS (see the section on the METHOD_OPT parameter in New Default Parameter Values for DBMS_STATS) If this change causes performance problems then you can re-gather statistics on this table without histograms or change the value of METHOD_OPT parameter.
Peeking of User Binds in Siebel CRM
In Siebel CRM application, Siebel Object Manager (OM) may generate a SQL statement with the syntax of :n in the WHERE clause. The :n is placeholder for values to be replaced at run-time. At run-time, Siebel OM substitutes the real values for the placeholder(s) to seek a better query plan. This process is called “peeking of user binds” or simply “bind peek”.
select LAST_NAME from S_CONTACT where TIMEZONE_ID=:n;
On a hard parse, Oracle will peek the value provided for :n, and optimize the query as if the same query was submitted with this particular literal value. This allows the CBO to leverage optimizer statistics and generate the most efficient query plan for given value. Disabling of bind peeking, or inaccurate statistics, or excessive histogram statistics may equally result in generation of suboptimal query execution plans, and noticeable degradation in Siebel CRM application views response times.
Recommended parameters setting for Oracle 9i and Oracle 10g database with Siebel.
Below are the recommended values for Oracle database with Siebel CRM application. This parameter worked for most of the platform and different loads. This parameter may require tweaking depending upon hardware, software and workload configurations.
Note: It is advisable to consult Oracle or Siebel support before setting any hidden parameter at database level. You should not implement any hidden parameter until bug symptoms are not confirmed, and hidden parameter are recommended by Oracle/Siebel Support. Below mentioned parameter are identified as best practice because of known bugs in oracle and implemented as workaround to fix and improve the performance of overall system. You should also take care of impact on system before implementing any parameter at database level.
Parameter Oracle 9i Oracle 10g Rel2 Comments
Optimizer_mode CHOOSE ALL_ROWS This parameter specifies the whether to use CBO or RBO as mentioned above. RBO is depreciated by Oracle from Oracle 10g onwards.
Optimizer_dynamic_sampling 1 2 This parameter controls the no of blocks read by dynamic sampling query. This parameter will help in creating more accurate selectivity and cardinality estimates, which helps the CBO to generate the faster execution plan. By default oracle will sample 32 random blocks and dynamic sampling will not occur on the table that contains less than 32 blocks.
Optimizer_max_permutations 100 2000 This parameter limits the optimizer maximum join permutations per query block. Starting with Oracle Database 10g, optimizer_max_permutations is an obsolete parameter. It is recommended to leave this hidden parameter setting to its default value.
Optimizer_feature_enable 9.2.0 10.2.0 This parameter specifies that which optimizer version’s functionality to use. Setting the value to old version will allow you to use old version capabilities.
Optimizer_index_cost_adj 1 10 This parameter controls the access path selection to be more or less index friendly. Setting this parameter value between 10 and 30 may result in huge performance gains as SQL statement changes from large-table full table scan to index range scans. Setting the value of too low optmizer_index_cost_adj=1, the CBO will still choose FTS for such as very small tables that resides on two blocks.
Optimizer_index_caching 0 100 This parameter to specify oracle the percentage of index block it could expect to find it in buffer cache on average. The default value of zero cause’s oracle to believe the cache is devoid of index blocks. A maximum value of 100 causes Oracle to believe that the cache has the entire index block. The value you want to use is hard to pinpoint precisely, but it is somewhere between 0 and 100.
Db_block_size 8 8 This parameter specifies the initial size of database block. For OLTP application, ideal value is 8 KB.
Db_multiblock_read_count 32 32 For most implementations, this value should be set between 16 and 32 blocks and adjusted as necessary. You may want to set an initial value of 32
Cursor_sharing Exact Exact This parameter specifies the plan sharing among the sql’s. For Siebel application this parameter must be set to “Exact” always.
_no_or_expand True True This parameter specifies 'OR' expansion during optimization is disabled. Setting this parameter to TRUE is providing better performance. However, Siebel does not perform better consistently with this setting. It is recommended to set this parameter to FALSE
_b_tree_bitmap_plans False False This parameter enables the use of bitmap plans for tables with only Btree indexes.
_gc_defer_time 0 0 This parameter applies to RAC databases. Setting this parameter to 0, will lead to immediate transfer of deferred blocks from existing node to requesting node which going to reduce the latency between cluster nodes and improve the performance.
_always_semi_joins OFF OFF This parameter specifies to always use semi-joins when possible. Setting this parameter to OFF renders better performance. This happens due to Siebel CRM OM’s FIRST_ROWS_10 optimization strategy. However, it is recommended that a customer run a thorough performance testing to verify whether this setting improves performance in the Siebel environment
_partition_view_enabled False False This parameter enables/disable the use of partitioned views. For Siebel OLTP environment, it is recommended to set this parameter to False.
Hard-coded Session Parameters for Siebel OM Sessions
In Siebel 7.7/7.8 on Oracle 9i platform, Siebel database connector has been modified to make a few alter session statements for OLTP operations. Same functionality has later been built into Siebel database connectors for Oracle 10g and 11g. Thus, in all current Siebel CRM versions (7.7, 7.8, 8.0, 8.1 and 8.1.1) each Siebel OM session will automatically set up the following session parameters:
• ALTER SESSION SET OPTIMIZER_MODE = FIRST_ROWS_10
When optimizer_mode is set to first_rows_n, the optimizer will in all cases use cost-based optimization, and set the optimizer goal for best response time (versus best throughput).
• ALTER SESSION SET HASH_JOIN_ENABLED = FALSE
(In Oracle 10g, this parameter is deprecated, Siebel uses _hash_join_enabled instead)
• ALTER SESSION SET _OPTIMIZER_SORTMERGE_JOIN_ENABLED = FALSE
• ALTER SESSION SET _OPTIMIZER_JOIN_SEL_SANITY_CHECK = TRUE
This is done to enable sanity checks for join using two columns. The above session parameters should not be altered to other values by any means.
your feedback is important. Thanks.
APPENDIX A: THE NEW DIFF_TABLE_STATS_* FUNCTION
In Oracle Database 10.2.0.4 the DBMS_STATS.DIFF_TABLE_STATS_* functions can be used to compare statistics for a table from two different sources. The statistics can be from:
A user statistics table and the current dictionary statistics
A single user statistics table containing two sets of statistics that can be identified using different statids
Two different user statistics tables
Two different points in history
The functions are defined as
DBMS_STATS.DIFF_TABLE_STATS_IN_STATTAB(
ownname IN VARCHAR2,
tabname IN VARCHAR2,
stattab1 IN VARCHAR2,
stattab2 IN VARCHAR2 DEFAULT NULL,
pctthreshold IN NUMBER DEFAULT 10,
statid1 IN VARCHAR2 DEFAULT NULL,
statid2 IN VARCHAR2 DEFAULT NULL,
stattab1own IN VARCHAR2 DEFAULT NULL,
stattab2own IN VARCHAR2 DEFAULT NULL)
RETURN DiffRepTab pipelined;
DBMS_STATS.DIFF_TABLE_STATS_IN_HISTORY(
ownname IN VARCHAR2,
tabname IN VARCHAR2,
time1 IN TIMESTAMP WITH TIME ZONE,
time2 IN TIMESTAMP WITH TIME ZONE DEFAULT
NULL, pctthreshold IN NUMBER DEFAULT 10)
RETURN DiffRepTab pipelined;
Below are examples of possible use cases for the diff_table_stats functions. Comparing statistics found in a user statistics table to those currently in the dictionary for a given table In this example we compare the statistics in the user statistic table TAB1 with the current dictionary statistics for the table EMP.
SQL> select report, maxdiffpct from
table(dbms_stats.diff_table_stats_in_stattab(null,'emp','tab1'));
Comparing two sets of statistics identified by different statids in the same user statistics table In this example we compare two different sets of statistics, for the table EMP, which are stored in the user statistics table TAB1. Each set of statistics is identified by a different statid (stats1, stats2)
SQL> select report, maxdiffpct from
table(dbms_stats.diff_table_stats_in_stattab(null,'emp','tab1',NULL,10,'stats1','stats2'));
Comparing the current statistic for a table with those from a week ago in this example we compare the current statistics for the table EMP with those from a week ago.
SQL> select report, maxdiffpct from
table(dbms_stats.diff_table_stats_in_history(null,’emp’,systimestamp-7));
Example of a DBMS_STATS.DIFF_TABLE_STATS_* report
Below is an example of the report that is generated after running an of the DBMS_STATS.DIFF_TABLE_STATS_* functions.
DBMS_STATS.DIFF_TABLE_STATS_IN_STATTAB(NULL,'EMP','TAB1')
################################################
STATISTICS DIFFERENCE REPORT FOR:
TABLE : EMP
OWNER: SCOTT
SOURCE A: User statistics table TAB1
: Statid :
: Owner : SCOTT
SOURCE B: Current Statistics in dictionary
PCTTHRESHOLD: 10
NO DIFFERENCE IN TABLE / (SUB) PARTITION STATISTICS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
COLUMN STATISTICS DIFFERENCE:
COL_NAME SRC NDV DENSITY HIST NULLS LEN MIN MAX SIZE
DEPTNO A 3 .333333333 NO 0 3
C10B C11F 14
B 3 .035714285 YES 0
Wednesday, July 22, 2009
Subscribe to:
Post Comments (Atom)
2 comments:
The parameters mentioned are very widely used and are quite useful for effecient running of a database system.
Maybe enhanced debugging can improve the system further in years to come.
Good job Mr. Pathak ;)
just found this
great info
thanks
Post a Comment