This is the well-known column on dynamic sampling by Tom Kyte, published in the January 2009 issue of Oracle Magazine.
On Dynamic Sampling
By Tom Kyte
Our technologist samples dynamically, considers usage, and sets levels.
My questions are related to dynamic
sampling. What does it really do, when would I consider using it, and
what are the meanings of all the different levels it can be set to?
Dynamic sampling first became available in Oracle9i
Database Release 2. It is the ability of the cost-based optimizer (CBO)
to sample the tables a query references during a hard parse, to
determine better default statistics for unanalyzed segments, and to
verify its “guesses.” This sampling takes place only at hard parse time
and is used to dynamically generate better statistics for the optimizer
to use, hence the name dynamic sampling.
The optimizer uses a variety of inputs to come
up with a plan. It uses any and all constraints defined on the table;
system statistics—information about your server’s I/O speeds, CPU
speed, and the like; and statistics gathered from the segments involved
in the query. The optimizer uses statistics to estimate
cardinalities—the number of rows each step in a given plan is expected
to return—and those cardinalities are a major variable in computing the
cost of a query. When cardinalities are incorrectly estimated, the
optimizer may choose an inefficient query plan. The No. 1, and some
might say only, reason for an inefficient plan’s being generated by the
optimizer is inaccurate cardinality estimations. I like to say “right
cardinality equals right plan; wrong cardinality equals wrong plan.”
So, “right” is the motivation behind dynamic
sampling: to help the optimizer get the right estimated cardinality
values. Feeding the optimizer more information and more-accurate
information, specific to the query itself, will help the optimizer come
up with the optimal execution plan.
Dynamic sampling offers 11 setting levels (0 through 10), and I’ll explain the different levels, but note that in Oracle9i Database Release 2, the default dynamic sampling level value is 1, whereas in Oracle Database 10g Release 1 and above, it defaults to 2.
Ways Dynamic Sampling Works
There are two ways to use dynamic sampling:
- The OPTIMIZER_DYNAMIC_SAMPLING parameter
can be set at the database instance level and can also be overridden at
the session level with the ALTER SESSION command.
- The DYNAMIC_SAMPLING query hint can be added to specific queries.
In this column, I’m going to use the hinting
capability to demonstrate the effect of dynamic sampling, but you may
well use the session-level capability to increase the use of dynamic
sampling, especially in a reporting or data warehouse situation.
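For reference, the two forms look like this (a minimal sketch; the level value 4, the table name, and the bind variable are only illustrative):

-- Session level: raise the dynamic sampling level for an entire reporting session
alter session set optimizer_dynamic_sampling = 4;

-- Statement level: hint one query (the hint references the table alias)
select /*+ dynamic_sampling(t 4) */ count(*)
  from some_table t
 where some_column = :x;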
As stated before, dynamic sampling is used to
gather statistics for unanalyzed segments and to verify “guesses” made
by the optimizer. I’ll look at each of these uses in turn.
Unanalyzed Segments
The optimizer will use default statistic values
if a segment is not analyzed and you do not use dynamic sampling to get
a quick estimate. These default cardinality values are documented in Oracle Database Performance Tuning Guide.
These default statistic values are typically not very realistic,
because using them is a one-size-fits-all approach. The estimated row
counts are based on guesses at the number of blocks of data in the
table and an average row width. Without dynamic sampling, these guesses
will be off—by a large amount. Consider:
SQL> create table t
2 as
3 select owner, object_type
4 from all_objects
5 /
Table created.
SQL> select count(*) from t;
COUNT(*)
------------------------
68076
Now I’ll look at the estimates for a query that
accesses this unanalyzed table. To see the default cardinalities the
optimizer would use, I have to disable dynamic sampling (it is enabled
by default in Oracle9i Database Release 2 and above). I achieve
this via the DYNAMIC_SAMPLING hint, with a level of zero—zero being the
value that disables dynamic sampling—as shown in Listing 1.
Code Listing 1: Disabling dynamic sampling to see default cardinalities
SQL> set autotrace traceonly explain
SQL> select /*+ dynamic_sampling(t 0) */ * from t;
Execution Plan
------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 16010 | 437K| 55 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 16010 | 437K| 55 (0)| 00:00:01 |
--------------------------------------------------------------------------
As you can see, the estimated cardinality is
16,010, which is very far off from the real cardinality, about 68,000.
If I permit dynamic sampling, I get a much more realistic cardinality
estimate, as shown in Listing 2.
Code Listing 2: More-realistic cardinalities with dynamic sampling enabled
SQL> select * from t;
Execution Plan
------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 77871 | 2129K| 56 (2)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 77871 | 2129K| 56 (2)| 00:00:01 |
--------------------------------------------------------------------------
Note
------------------------------------------
- dynamic sampling used for this statement
Now, 77,871 is not exactly 68,000 (obviously),
but it is much closer to reality than 16,010 was. In general, the
optimizer will choose better query plans for queries accessing this
unanalyzed table when using dynamic sampling.
An inaccurate cardinality estimate can swing
either way, of course. In Listing 1, I showed the optimizer radically
underestimating the cardinality, but it can overestimate as well.
Consider the estimate in Listing 3.
Code Listing 3: Overestimating cardinalities
SQL> delete from t;
68076 rows deleted.
SQL> commit;
Commit complete.
SQL> set autotrace traceonly explain
SQL> select /*+ dynamic_sampling(t 0) */ * from t;
Execution Plan
------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 16010 | 437K| 55 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 16010 | 437K| 55 (0)| 00:00:01 |
--------------------------------------------------------------------------
SQL> select * from t;
Execution Plan
-----------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 28 | 55 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T | 1 | 28 | 55 (0)| 00:00:01 |
--------------------------------------------------------------------------
Note
---------------------------------------
- dynamic sampling used for this statement
Think about what might happen if the optimizer
guessed 16,010 rows instead of 1 row in this case. For queries
accessing table T, the optimizer would grossly overestimate the rows
that will be returned from T and generate incorrect plans as a result.
So, where would dynamic sampling be useful?
First and foremost, dynamic sampling is useful when you are accessing
any table that has been created and loaded but not yet analyzed.
Starting in Oracle Database 10g Release 1, the CBO is the only
supported optimizer, and it needs accurate statistics to perform its
job correctly. If a table exists that hasn’t had statistics gathered
yet, the optimizer will be flying blind. Dynamic sampling gives the CBO
the information it needs in order to operate correctly.
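If you want to see which tables in your schema are in that state, a quick data dictionary check (a minimal sketch) is:

-- Tables with no gathered statistics have a NULL LAST_ANALYZED date
select table_name
  from user_tables
 where last_analyzed is null;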
The second use for dynamic sampling is with
global temporary tables. Often global temporary tables do not have
statistics, and dynamic sampling can provide the optimizer with
information about these tables. Your application would load the global
temporary table, and the first hard parse of any query that utilized
the temporary table would dynamically sample it to ascertain the
correct size of the temporary table.
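A minimal sketch of that scenario follows; the table name, columns, and load query are hypothetical:

-- Hypothetical global temporary table; no statistics are gathered on it
create global temporary table gtt_work
( id   number,
  kind varchar2(30) )
on commit preserve rows;

-- The application loads it...
insert into gtt_work
  select object_id, object_type
    from all_objects
   where rownum <= 1000;

-- ...and the first hard parse of a query against GTT_WORK can dynamically
-- sample it to find out how many rows it really holds
select count(*)
  from gtt_work
 where kind = 'TABLE';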
When the Optimizer Guesses
In addition to providing the optimizer with
necessary statistics when it is accessing unanalyzed segments, dynamic
sampling may also help the optimizer to validate any of its guesses.
The optimizer has access to statistics about “single things” in
general; when you gather statistics by using DBMS_STATS, the optimizer
receives information about
- The table, the number of rows, average row widths, and the like.
- Each individual column, the high value, the
low value, the number of distinct values, histograms (maybe), and the
like. (Oracle Database 11g can even gather statistics on an expression, but it is still a single expression). Additionally, Oracle Database 11g can gather statistics on groups of columns, and these statistics can be used in equality comparisons.
- Each individual index, the clustering factor, the number of leaf blocks, the index height, and the like (a quick way to inspect these gathered values is sketched just after this list).
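All of those values live in the data dictionary. A minimal sketch of how to inspect them for a table named T, assuming statistics have been gathered on it:

-- Table-level statistics
select num_rows, blocks, avg_row_len
  from user_tables
 where table_name = 'T';

-- Column-level statistics
select column_name, num_distinct, num_nulls, histogram
  from user_tab_col_statistics
 where table_name = 'T';

-- Index-level statistics
select index_name, blevel, leaf_blocks, clustering_factor
  from user_indexes
 where table_name = 'T';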
So, given a table with various columns, the
optimizer has lots of information to work with, but it is missing some
vital information, including statistics about how the various columns
interact with each other and statistics about any correlations in
column values. For example, suppose you have a table of census
information that includes a record for everyone on the planet. One of
the table’s attributes—MONTH_BORN_IN—is a character string field
containing each person’s birth month. Another
column—ZODIAC_SIGN—contains character strings with the name of each
person’s zodiac sign.
After gathering statistics, you ask the
optimizer to estimate how many people were born in December, and it
would almost certainly be able to come up with a very accurate estimate
of 1/12 of the data (assuming an even distribution of birth dates). If
you asked the optimizer to estimate how many people are Pisces, it
would again likely come up with an accurate estimate of 1/12 of the
data again.
So far, so good. But now you ask, “How many
people born in December are Pisces?” All Pisces were born in either
February or March, but the optimizer isn’t aware of that. All the
optimizer knows is that December will retrieve 1/12 of the data and
that Pisces will retrieve 1/12 of the data; it assumes that the two
columns are independent and, using very simple logic, says, “The number
of people born in December who are also Pisces will be 1/12 times 1/12,
or 1/144, of the data.” The actual number of rows—zero—will be very far
off from the optimizer’s guess of 1/144th of the rows in the table, and
the result is typically a suboptimal plan, due to the poor cardinality
estimates.
Dynamic sampling can help solve this. When it is
set high enough, to level 3 or above, the optimizer will validate its
guesses by using a dynamic sample.
To demonstrate this, I’ll create a table with
some very specific data. Note that in this table, if the FLAG1 column
has a value of Y, the FLAG2 column will have a value of N, and vice
versa. All the data is either Y, N, or N, Y—there are no Y, Y records
and no N, N records. Listing 4 shows the creation of the table and the
gathering of statistics.
Code Listing 4: Creating the “FLAG” table and gathering statistics
SQL> create table t
2 as select decode( mod(rownum,2), 0, 'N', 'Y' ) flag1,
3 decode( mod(rownum,2), 0, 'Y', 'N' ) flag2, a.*
4 from all_objects a
5 /
Table created.
SQL> create index t_idx on t(flag1,flag2);
Index created.
SQL> begin
2 dbms_stats.gather_table_stats
3 ( user, 'T',
4 method_opt=>'for all indexed columns size 254' );
5 end;
6 /
PL/SQL procedure successfully completed.
So I have the table and have gathered
statistics, including histograms for the FLAG1 and FLAG2 columns. The
following shows the number of rows in the table, half the number, and a
quarter of the number:
SQL> select num_rows, num_rows/2,
num_rows/2/2 from user_tables
where table_name = 'T';
NUM_ROWS NUM_ROWS/2 NUM_ROWS/2/2
-------- ---------- ------------
68076 34038 17019
If I look at the data in the table, I can see
how many of the rows would be returned for FLAG1 = 'N' (half of the
data, given how I constructed it) and how many would be returned for
FLAG2 = 'N' (again half of the data). I can verify this by using
autotrace again, as shown in Listing 5.
Code Listing 5: Good cardinality estimates, looking at half of the table data
SQL> set autotrace traceonly explain
SQL> select * from t where flag1='N';
Execution Plan
------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 33479 | 3432K| 292 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| T | 33479 | 3432K| 292 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("FLAG1"='N')
SQL> select * from t where flag2='N';
Execution Plan
----------------------------
Plan hash value: 1601196873
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 34597 | 3547K| 292 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| T | 34597 | 3547K| 292 (1)| 00:00:04 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("FLAG2"='N')
So far, so good—the estimated cardinalities are accurate, and the optimizer can generate optimal query plans.
Last, I can see the value the optimizer will
“guess” by default if I query FLAG1 = 'N' and FLAG2 = 'N'—in this case,
about a quarter of the rows in the table, as shown in Listing 6.
Code Listing 6: Poor cardinality estimates, looking at a quarter of the table data
SQL> select * from t where flag1 = 'N' and flag2 = 'N';
Execution Plan
----------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17014 | 1744K| 292 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| T | 17014 | 1744K| 292 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
----------------------------------------------------
1 - filter("FLAG1" = 'N' AND "FLAG2" = 'N')
Listing 6 shows that when the predicate gets
just a little more complex, the optimizer misses the estimated
cardinality by a huge amount—it doesn’t know about the relationship
between FLAG1 and FLAG2. Enter dynamic sampling, shown in Listing 7.
Code Listing 7: Good cardinality estimate, looking at only six rows
SQL> select /*+ dynamic_sampling(t 3) */ * from t where flag1 = 'N' and flag2 = 'N';
Execution Plan
-----------------------------
Plan hash value: 470836197
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6 | 630 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 6 | 630 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T_IDX | 6 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
----------------------------------------------------
2 - access("FLAG1"='N' AND "FLAG2"='N')
Note the much better row estimate (6 instead of
more than 17,000) in Listing 7, compared to Listing 6, and also note
the very different (and now optimal) plan. Instead of a full table
scan, the optimizer has decided to use the index, and this execution
plan will definitely be optimal compared to the original full table
scan, because in reality, no data will be returned. The index will discover that instantly.
The Dynamic Sampling Levels
A frequently asked question about dynamic
sampling is, “What do all of the levels mean?” The answer is pretty
straightforward and documented in Oracle Database Performance Tuning Guide.
I reproduce that information here:
The sampling levels are as follows if the
dynamic sampling level used is from a cursor hint or from the
OPTIMIZER_DYNAMIC_SAMPLING initialization parameter:
- Level 0: Do not use dynamic sampling.
- Level 1: Sample all tables that have
not been analyzed if the following criteria are met: (1) there is at
least 1 unanalyzed table in the query; (2) this unanalyzed table is
joined to another table or appears in a subquery or non-mergeable view;
(3) this unanalyzed table has no indexes; (4) this unanalyzed table has
more blocks than the number of blocks that would be used for dynamic
sampling of this table. The number of blocks sampled is the default
number of dynamic sampling blocks (32).
- Level 2: Apply dynamic sampling to
all unanalyzed tables. The number of blocks sampled is two times the
default number of dynamic sampling blocks.
- Level 3: Apply dynamic sampling to
all tables that meet Level 2 criteria, plus all tables for which
standard selectivity estimation used a guess for some predicate that is
a potential dynamic sampling predicate. The number of blocks sampled is
the default number of dynamic sampling blocks. For unanalyzed tables,
the number of blocks sampled is two times the default number of dynamic
sampling blocks.
- Level 4: Apply dynamic sampling to
all tables that meet Level 3 criteria, plus all tables that have
single-table predicates that reference 2 or more columns. The number of
blocks sampled is the default number of dynamic sampling blocks. For
unanalyzed tables, the number of blocks sampled is two times the
default number of dynamic sampling blocks.
- Levels 5, 6, 7, 8, and 9: Apply
dynamic sampling to all tables that meet the previous level criteria
using 2, 4, 8, 32, or 128 times the default number of dynamic sampling
blocks respectively.
- Level 10: Apply dynamic sampling to all tables that meet the Level 9 criteria using all blocks in the table.
Note that in Oracle9i Database Release 2,
the default setting for dynamic sampling is level 1. At this setting,
the optimizer will tend to use dynamic sampling only in some
cases when a table is unanalyzed. This includes the case of global
temporary tables without statistics but doesn’t ensure that dynamic
sampling will be used against all unanalyzed tables. In Oracle Database 10g
Release 1 and above, the default setting was raised to 2. This setting
ensures that the optimizer will have a good estimate whenever it
optimizes a query that accesses a table with no statistics.
I used level 3 in the “FLAG” table example in
Listing 7. Level 3 instructs the optimizer to collect a sample to
validate a guess it might have made. For example, if I turn on
SQL_TRACE and run the example in Listing 7, I will find the SQL in
Listing 8 in my trace file.
Code Listing 8: Trace file SQL generated by Listing 7 query
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE
NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false')
NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),:"SYS_B_00"),
NVL(SUM(C2),:"SYS_B_01"), NVL(SUM(C3),:"SYS_B_02")
FROM
(SELECT /*+ IGNORE_WHERE_CLAUSE NO_PARALLEL("T") FULL("T")
NO_PARALLEL_INDEX("T") */ :"SYS_B_03" AS C1, CASE WHEN "T"."FLAG1"=
:"SYS_B_04" AND "T"."FLAG2"=:"SYS_B_05" THEN :"SYS_B_06" ELSE :"SYS_B_07"
END AS C2, CASE WHEN "T"."FLAG2"=:"SYS_B_08" AND "T"."FLAG1"=:"SYS_B_09"
THEN :"SYS_B_10" ELSE :"SYS_B_11" END AS C3 FROM "T" SAMPLE BLOCK
(:"SYS_B_12" , :"SYS_B_13") SEED (:"SYS_B_14") "T") SAMPLESUB
The CASE expressions in Listing 8 show the optimizer
trying to validate its guess—it is looking for any correlation between
the FLAG1 and FLAG2 columns. By executing this sampling query during
the hard parse, the optimizer was able to figure out that very few rows
would be returned by this WHERE clause, adjust the estimated
cardinality, and arrive at the correct plan.
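If you want to see this recursive SQL yourself, one way to capture it (a minimal sketch, assuming you can read the trace file from the server) is:

-- Enable SQL_TRACE, run the hinted query, then turn tracing off
alter session set sql_trace = true;

select /*+ dynamic_sampling(t 3) */ *
  from t
 where flag1 = 'N' and flag2 = 'N';

alter session set sql_trace = false;

-- Then format the trace file from the operating system, for example:
-- tkprof <tracefile>.trc report.txt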
When to Use Dynamic Sampling
“When should I use dynamic sampling?” is a
tricky question. As with any other feature, there are times to use it
and times to avoid it. So far I’ve concentrated on the “goodness” of
dynamic sampling, and based on that, it seems that you should set the
level to 3 or 4 and just let the optimizer always use dynamic sampling
to validate its guesses.
That makes sense in an environment in which you
spend most of your time executing SQL and very little of your overall
time hard-parsing the SQL. That is, the SQL you are executing runs for
a long time and the parse time is a small portion of the overall
execution time, such as in a data warehousing environment. There,
dynamic sampling at levels above the default makes complete sense. You
are willing to give the optimizer a little more time during a hard
parse (when sampling takes place) to arrive at the optimal plan for
these complex queries.
That leaves the other classic type of
environment: the online transaction processing (OLTP) system. Here, in
general, you are executing queries thousands of times per second and
spend very little time executing a given query—the queries are
typically small and fast. Increasing the parse time in an OLTP system
might well cause you to spend more time parsing than executing SQL. You
do not want to increase the parse times here, so higher levels of
dynamic sampling would not be advisable.
So what happens when you need the benefit of the
dynamic sample at level 3 or above in an OLTP system? That is when it
would be time to look into SQL Profiles, a new feature of Oracle
Database 10g Release 1 and above (download.oracle.com/docs/cd/B28359_01/server.111/b28274/sql_tune.htm#PFGRF02605).
A SQL profile is roughly equivalent, in some
respects, to a really aggressive dynamic sample—it, too, validates any
guesses, but it has the ability to persist this information in the data
dictionary. In some respects, a SQL profile is like a “static sample,”
as opposed to the dynamic samples we’ve been discussing. Using a SQL
profile is a bit like gathering statistics for a query and storing that
information for the optimizer to use at hard parse time—it saves on the
cost of dynamic sampling by “sampling” once and persisting the values.
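As a rough sketch of how a SQL profile is typically created (through the SQL Tuning Advisor; the task and profile names here are hypothetical):

declare
  l_task varchar2(128);
begin
  -- Create and run a tuning task for the problem query
  l_task := dbms_sqltune.create_tuning_task(
              sql_text  => 'select * from t where flag1 = ''N'' and flag2 = ''N''',
              task_name => 'flag_query_task' );
  dbms_sqltune.execute_tuning_task( task_name => l_task );
end;
/

-- Review the recommendations
select dbms_sqltune.report_tuning_task( 'flag_query_task' ) from dual;

-- If a profile is recommended, accept it so the information persists
begin
  dbms_sqltune.accept_sql_profile(
    task_name => 'flag_query_task',
    name      => 'flag_query_profile' );
end;
/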
So, why would you not just always use SQL
Profiles? Well, in a data warehouse and many reporting systems, you do
not utilize bind variables—you put the literal values into the SQL
query itself, so the optimizer has good insight into exactly the data
you are interested in. In a data warehouse system, query performance is
paramount and the users are typically generating SQL in an ad hoc
fashion. They do not run the same SQL from day to day, so there is no
SQL from which to create a SQL profile! The SQL is generated, executed,
and maybe never executed again. Dynamic sampling is perfect for such a
data warehouse situation; it quickly validates the optimizer guesses at
hard parse time and gives the optimizer the information it needs right
then and there. Because that query will likely not be executed over and
over again, the information need not be persisted in the data
dictionary.
In the OLTP system, the opposite is true. You do
use bind variables; you do use the same SQL over and over and over
again. Avoiding any bit of work you can at runtime is your goal in this
kind of system, because you want subsecond response times. Here the SQL
profile makes more sense: you gather the statistics for the query once
and persist them for all subsequent optimizations of that query.
More Information
In addition to the Oracle documentation at otn.oracle.com/documentation, Jonathan Lewis’ excellent book Cost-Based Oracle Fundamentals (Apress, 2006) and my book Effective Oracle by Design (Oracle Press, 2003) provide more information on dynamic sampling and other optimizer topics.