Search This Blog

Showing posts with label Explain plan. Show all posts
Showing posts with label Explain plan. Show all posts

Thursday, August 2, 2012

Understanding Oracle QUERY PLAN - Part2 (Exploring SQL Joins)

This is the second part of the article Understanding Oracle Query Plan. In this part we will deal with SQL Joins.
For the first part of this article, click here
This time we will explore and try to understand query plan for joins. Let’s take on joining of two tables and let’s find out how Oracle query plan changes. We will start with two tables as following,
Product Table
- Stores 15000 products. Each product has unique numeric id.
Buyer Table
- Stores 15,000,00 buyers who buy the above products. This table has unique id field as well as a prodid (product id) field that links back to the product table.
Before we start, please note, we do not have any index or table statistics present for these tables.

SORT MERGE JOIN

  
SQL> explain plan for SELECT *
  2  FROM PRODUCT, BUYER
  3  WHERE PRODUCT.ID = BUYER.PRODID;

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
---------------------------------------------

---------------------------------------
| Id  | Operation           | Name    |
---------------------------------------
|   0 | SELECT STATEMENT    |         |
|   1 |  MERGE JOIN         |         |
|   2 |   SORT JOIN         |         |
|   3 |    TABLE ACCESS FULL| BUYER   |
|*  4 |   SORT JOIN         |         |
|   5 |    TABLE ACCESS FULL| PRODUCT |
---------------------------------------

Above plan tells us that CBO is opting for a Sort Merge join. In this type of joins, both tables are read individually and then sorted based on the join predicate and after that sorted results are merged together (joined).
  
Read Product ---> Sort by product id ------|
                                           |---> join
Read Buyer   ---> Sort by product id ------|
Joins are always a serial operation even though individual table access can be parallel.
Now let’s create some statistics for these tables and let’s check if CBO does something else than SORT MERGE join.

HASH JOIN

SQL> analyze table product compute statistics;

Table analyzed.

SQL> analyze table buyer compute statistics;

Table analyzed.

SQL> explain plan for SELECT *
  2  FROM PRODUCT, BUYER
  3  WHERE PRODUCT.ID = BUYER.PRODID;

Explained.
 
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------
Plan hash value: 2830850455

------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes |
------------------------------------------------------
|   0 | SELECT STATEMENT   |         | 25369 |  2279K|
|*  1 |  HASH JOIN         |         | 25369 |  2279K|
|   2 |   TABLE ACCESS FULL| PRODUCT | 15856 |  1254K|
|   3 |   TABLE ACCESS FULL| BUYER   |   159K|  1718K|
------------------------------------------------------
  
CBO chooses to use Hash join instead of SMJ once the tables are analyzed and CBO has enough statistics. Hash join is a comparatively new join algorithm which is theoretically more efficient than other types of joins. In hash join, Oracle chooses the smaller table to create an intermediate hash table and a bitmap. Then the second row source is hashed and checked against the intermediate hash table for matching joins. The bitmap is used to quickly check if the rows are present in hash table. The bitmap is especially handy if the hash table is too huge. Remember only cost based optimizer uses hash join.
Also notice the FTS operation in the above example. This may be avoided if we create some index on both the tables. Watch this,
SQL> create index idx_prod_id on product (id);

Index created.

SQL> create index idx_buyer_prodid on buyer (prodid);

Index created.

SQL>  explain plan for select product.id
  2  FROM PRODUCT, BUYER
  3  WHERE PRODUCT.ID = BUYER.PRODID;

Explained.

SQL>  select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------

------------------------------------------------------------------
| Id  | Operation             | Name             | Rows  | Bytes |
------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                  | 25369 |   198K|
|*  1 |  HASH JOIN            |                  | 25369 |   198K|
|   2 |   INDEX FAST FULL SCAN| IDX_PROD_ID      | 15856 | 63424 |
|   3 |   INDEX FAST FULL SCAN| IDX_BUYER_PRODID |   159K|   624K|
------------------------------------------------------------------

NESTED LOOP JOIN

There is yet another kind of joins called Nested Loop Join. In this kind of joins, each record from one source is probed against all the records of the other source. The performance of nested loop join depends heavily on the number of records returned from the first source. If the first source returns more record, that means there will be more probing on the second table. If the first source returns less record, that means, there will be less probing on the second table.
To show a nested loop, let’s introduce one more table. We will just copy the product table into a new table, product_new. All these tables will have index.
Now I write a simple query below,
  
select  *
from buyer, product, product_new
where buyer.prodid=product.id
and buyer.prodid = product_new.id;
And then I checked the plan. But the plan shows a HASH JOIN condition and not a NESTED LOOP. This is, in fact, expected because as discussed earlier hash-join is more efficient compared to other joins. But remember hash join is only used for cost based optimizer. So if I force Oracle to use rule based optimizer, I might be able to see nested joins. I can do that by using a query hint. Watch this,
SQL> explain plan for
  2   select /*+ RULE */ *
  3  from buyer, product, product_new
  4  where buyer.prodid=product.id
  5  and buyer.prodid = product_new.id;

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------
Plan hash value: 3711554028

-----------------------------------------------------------
| Id  | Operation                      | Name             |
-----------------------------------------------------------
|   0 | SELECT STATEMENT               |                  |
|   1 |  TABLE ACCESS BY INDEX ROWID   | PRODUCT          |
|   2 |   NESTED LOOPS                 |                  |
|   3 |    NESTED LOOPS                |                  |
|   4 |     TABLE ACCESS FULL          | PRODUCT_NEW      |
|   5 |     TABLE ACCESS BY INDEX ROWID| BUYER            |
|*  6 |      INDEX RANGE SCAN          | IDX_BUYER_PRODID |
|*  7 |    INDEX RANGE SCAN            | IDX_PROD_ID      |
-----------------------------------------------------------
Voila! I got nested loops! As you see, this time I have forced Oracle to use rule based optimizer by providing /*+ RULE */ hint. So Oracle has now no option but to use nested loops. As apparent from the plan, Oracle performs a full scan of product_new and index scans for other tables. First it joins buyer with product_new by feeding each row of buyer to product_new and then it sends the result set to probe against product.
Ok, with this I will conclude this article. The main purpose of this article and the earlier one was to make you familiar on Oracle query execution plans. Please keep all these ideas in mind because in my next article I will show how we can use this knowledge to better tune our SQL Queries. Stay tuned.

Understanding Oracle QUERY PLAN - A 10 minutes guide

Confused about how to understand Oracle Query Execution Plan? This 10 minutes step by step primer is part of a two part article that will teach you all the right things that you must know about it.

What is Query Execution Plan?

When you fire an SQL query to Oracle, Oracle first comes up with a query execution plan in order to fetch the desired data from the physical tables. This query execution plan is crucial as different execution plan take different time for the query to execute.
Oracle Query Execution Plan actually depends on the choice of Oracle optimizer – Rule based (RBO) Or Cost based (CBO) Optimizer. For Oracle 10g, CBO is the default optimizer. Cost Based optimizer enforces Oracle to generate the optimization plan by taking all the related table statistics into consideration. On the other hand, RBO uses a fixed set of pre-defined rules to generate the query plan. Obviously such fixed set of rules might not always be accurate to come up with most efficient plan, as actual plan depends a lot on the nature and volume of tables’ data.

Understanding Oracle Query Execution Plan

But this article is not for comparing RBO and CBO (In fact, there is not much point in comparing these two). This article will briefly help you understand,
  1. How can we see Query Execution Plan
  2. How do we understand (or rather interpret) the execution plan.
So let’s begin. I will be using Oracle 10g server and SQL *Plus client to demonstrate all the details.

Oracle Full Table Scan (FTS)

Let’s start by creating a simple product table with the following structure,
ID number(10)
NAME varchar2(100)
DESCRIPTION varchar2(255)
SERVICE varchar2(30)
PART_NUM varchar2(50)
LOAD_DATE date
Next I will insert 15,000 records into this newly created table (data taken from one of my existing product table from one of my client’s production environment).
Remember, currently there is no index on the table.
So we start our journey by writing a simple select statement on this table as below,
  

SQL> explain plan for select * from product;
Explained.


SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------
Plan hash value: 3917577207
-------------------------------------
| Id  | Operation          | Name   |
-------------------------------------
|   0 | SELECT STATEMENT   |        |
|   1 |  TABLE ACCESS FULL | PRODUCT|
-------------------------------------

Note
-----
   - rule based optimizer used (consider using cbo)

Notice that optimizer has decided to use RBO instead of CBO as Oracle does not have any statistics for this table. Let’s now build some statistics for this table by issuing the following command,
SQL> Analyze table product compute statistics;
Now let’s do the same experiment once again,
  
SQL> explain plan for select * from product;
Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------
Plan hash value: 3917577207
-----------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes |
-----------------------------------------------------
|   0 | SELECT STATEMENT   |         | 15856 |  1254K|
|   1 |  TABLE ACCESS FULL | PRODUCT | 15856 |  1254K|
-----------------------------------------------------

You can easily see that this time optimizer has used Cost Based Optimizer (CBO) and has also detailed some additional information (e.g. Rows etc.)
The point to note here is, Oracle is reading the whole table (denoted by TABLE ACCESS FULL) which is very obvious because the select * statement that is being fired is trying to read everything. So, there’s nothing interesting up to this point.

Index Unique Scan

Now let’s add a WHERE clause in the query and also create some additional indexes on the table.
SQL> create unique index idx_prod_id on product (id) compute statistics;

Index created.

SQL>  explain plan for select id from product where id = 100;

Explained.

SQL>  select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
---------------------------------------------------------
Plan hash value: 2424962071

---------------------------------------------------------
| Id  | Operation          | Name        | Rows  | Bytes |
---------------------------------------------------------
|   0 | SELECT STATEMENT   |             |     1 |     4 |
|*  1 |  INDEX UNIQUE SCAN | IDX_PROD_ID |     1 |     4 |
---------------------------------------------------------

So the above statement indicates that CBO is performing Index Unique Scan. This means, in order to fetch the id value as requested, Oracle is actually reading the index only and not the whole table. Of course this will be faster than FULL TABLE ACCESS operation shown earlier.

Table Access by Index RowID

Searching the index is a fast and an efficient operation for Oracle and when Oracle finds the desired value it is looking for (in this case id=100), it can also find out the rowid of the record in product table that has id=100. Oracle can then use this rowid to fetch further information if requested in query. See below,
SQL> explain plan for select * from product where id = 100;

Explained.

SQL>  select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------
Plan hash value: 3995597785

----------------------------------------------------------
| Id  | Operation                   | Name   |Rows | Bytes|
----------------------------------------------------------
|   0 | SELECT STATEMENT            |        |   1 |   81 |
|   1 |  TABLE ACCESS BY INDEX ROWID| PRODUCT|   1 |   81 |
|*  2 |   INDEX UNIQUE SCAN         | IDX_PROD_ID | 1 |   |
----------------------------------------------------------

TABLE ACCESS BY INDEX ROWID is the interesting part to check here. Since now we have specified select * for id=100, so Oracle first use the index to obtain the rowid of the record. And then it selects all the columns by the rowid.

Index Range Scan

But what if we specify a >, or between criteria in the WERE clause instead of equality condition? Like below,

  
SQL> explain plan for select id from product where id <10 explained.="explained." sql="sql"> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
---------------------------------------------
Plan hash value: 1288034875

-------------------------------------------------------
| Id  | Operation        | Name        | Rows  | Bytes |
-------------------------------------------------------
|   0 | SELECT STATEMENT |             |     7 |    28 |
|*  1 |  INDEX RANGE SCAN| IDX_PROD_ID |     7 |    28 |
-------------------------------------------------------

So this time CBO goes for an Index Range Scan instead of INDEX UNIQUE SCAN. The same thing will normally happen if we use a between clause also.

Index Fast Full Scan

Now, let’s see another interesting aspect of INDEX scan here by just altering the “ 10”. Before we see the outcome, just remind yourself that there are 15000 over products with their ids starting from 1 to 15000+. So if we write “10” we are likely to get almost 14990+ records in return. So does Oracle go for an INDEX RANGE SCAN in this case? Let’s see,
SQL> explain plan for select id from product where id>10;

Explained.

SQL>  select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------
Plan hash value: 2179322443

--------------------------------------------------------
| Id  | Operation            | Name       | Rows |Bytes |
--------------------------------------------------------
|   0 | SELECT STATEMENT     |            | 15849|63396 |
|*  1 |  INDEX FAST FULL SCAN| IDX_PROD_ID| 15849|63396 |
---------------------------------------------------------

So, Oracle is actually using a INDEX FAST FULL SCAN to “quickly” scan through the index and return the records from table. This scan is "quick" because unlike index full scan or index unique scan, INDEX FAST FULL SCAN utilizes multiple-block input-output (I/O) whereas the formers utilizes single block I/O.

Note on QUERY PLAN

FTS or Full Table Scan
  • Whole table is read upto high water mark
  • Uses multiblock input/output
  • Buffer from FTS operation is stored in LRU end of buffer cache
Index Unique Scan
  • Single block input/output
Index Fast Full Scan
  • Multi block i/o possible
  • Returned rows may not be in sorted order
Index Full Scan
  • Single block i/o
  • Returned rows generally will be in sorted order
So I think we covered the basics of simple SELECT queries running on a single table. In the next article, We will move forward to understand - that is, we will know how to read the query plan when there is join between tables.