Archive for June, 2015

Good new SQL tuning book

June 24th, 2015

People ask me from time to time what some good SQL tuning books are. It’s a tough question to answer. There are seminal books such as “SQL Tuning” by Dan Tow and “Cost-Based Oracle Fundamentals” by Jonathan Lewis, but these are advanced books that few people I know have actually read cover to cover. If you are looking for a practical SQL tuning cookbook you might want something a little less advanced. For a practical approach, I love “Refactoring SQL Applications” by Stephane Faroult, which is an easy read, though again it’s not a straightforward SQL tuning book.

Finally there is a book that provides a pragmatic, down-to-earth approach to tuning SQL queries: “Oracle Performance Tuning and Optimization: It’s all about the Cardinalities” by Kevin Meade.

I’ve been meaning to put together a general review but as that’s a lot to tackle I thought I’d go through some of the material in the book chapter by chapter.

Chapter 1 lays out “Driving Table and Join Order”

Meade lays out the 4 parts of a query as

  1. driving table
  2. join order
  3. access method
  4. join method

This is a great way to look at query optimization. As I’ve written about before with Visual SQL Tuning, based on Dan Tow’s work, the join order is the most important part of query optimization, and the first step in determining the join order is finding the driving table. The goal is

“Remove as many rows as possible as early as possible in query execution”

The big question is: how do we remove as many rows as possible as early as possible in the execution? We do that by finding the best driving table. The best driving table is the table whose predicate filter eliminates the most rows. To find the best predicate filter we have to go through the query, find which tables have predicate filters, and then determine how many rows are returned after the predicate filter versus the number of rows in the table. Calculating these filter ratios can be a good bit of busy work. I like my busy work automated, so when I was at Embarcadero, I worked with our DB Optimizer tool team to do this automatically. It was pretty cool. Below is a diagram produced from a query that Jonathan Lewis put together for an article on how to analyze a query visually.
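The filter-ratio idea itself is simple enough to sketch in a few lines. Here is a minimal illustration using Python and SQLite in place of Oracle; the table and column names echo the example query later in the post, but the data is invented for illustration:

```python
import sqlite3

# Sketch of the filter-ratio idea, using SQLite in place of Oracle.
# The customers/location names mirror the example query below; the
# data here is made up purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, location TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, "LONDON" if i % 50 == 0 else "OTHER") for i in range(1000)],
)

# Filter ratio = rows surviving the predicate / total rows in the table.
total = conn.execute("SELECT count(*) FROM customers").fetchone()[0]
filtered = conn.execute(
    "SELECT count(*) FROM customers WHERE location = 'LONDON'"
).fetchone()[0]
filter_ratio = filtered / total * 100
print(f"{filtered}/{total} rows survive -> filter ratio {filter_ratio:.1f}%")
```

The table with the lowest such ratio is the candidate driving table.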



The blue numbers to the bottom right of certain tables are the filter ratios. A low ratio means that a low percentage of the table is returned after applying the predicate filter.

How do you calculate these predicate filter ratios in an automated way without DB Optimizer? One of my favorite things about Chapter 1 is a query that Kevin Meade wrote to automatically calculate predicate filter ratios.

For example if I run

SQL> explain plan for
SELECT /*+ gather_plan_statistics */ order_line_data
FROM           customers cus
    INNER JOIN orders ord ON ord.id_customer = cus.id
    INNER JOIN order_lines orl ON orl.id_order = ord.id
    INNER JOIN products prd1 ON prd1.id = orl.id_product
    INNER JOIN suppliers sup1 ON sup1.id = prd1.id_supplier
WHERE   cus.location = 'LONDON'
    AND ord.date_placed BETWEEN sysdate - 7
                        AND     sysdate
    AND sup1.location = 'LEEDS'
    AND EXISTS ( SELECT null
                 FROM  alternatives    alt
                       INNER JOIN     products prd2
                         ON prd2.id = alt.id_product_sub
                       INNER JOIN     suppliers sup2
                         ON sup2.id = prd2.id_supplier
                 WHERE    alt.id_product = prd1.id
                       AND sup2.location != 'LEEDS' )

Then I run

SQL> @showplanfrpspreadsheetcode11g.sql

I’ll see some output like

with  frp_data as (
                    select '   11' id,'DELPHIXDB' table_owner,'PRODUCTS' table_name,'PRD2' table_alias,1999 num_rows,count(*) rowcount,1999 cardinality,cast(null as number) filtered_cardinality from DELPHIXDB.PRODUCTS PRD2 union all
                    select '   17' id,'DELPHIXDB' table_owner,'CUSTOMERS' table_name,'CUS' table_alias,14576 num_rows,count(*) rowcount,49 cardinality,count(case when "CUS"."LOCATION"='LONDON' then 1 end) filtered_cardinality from DELPHIXDB.CUSTOMERS CUS union all
                    select '    2' id,'DELPHIXDB' table_owner,'SUPPLIERS' table_name,'SUP1' table_alias,99 num_rows,count(*) rowcount,50 cardinality,count(case when "SUP1"."LOCATION"='LEEDS' then 1 end) filtered_cardinality from DELPHIXDB.SUPPLIERS SUP1 union all
                    select '    8' id,'DELPHIXDB' table_owner,'SUPPLIERS' table_name,'SUP2' table_alias,99 num_rows,count(*) rowcount,50 cardinality,count(case when "SUP2"."LOCATION"<>'LEEDS' then 1 end) filtered_cardinality from DELPHIXDB.SUPPLIERS SUP2 union all
                    select '   14' id,'DELPHIXDB' table_owner,'PRODUCTS' table_name,'PRD1' table_alias,1999 num_rows,count(*) rowcount,1999 cardinality,cast(null as number) filtered_cardinality from DELPHIXDB.PRODUCTS PRD1 union all
                    select '   18' id,'DELPHIXDB' table_owner,'ORDERS' table_name,'ORD' table_alias,71604 num_rows,count(*) rowcount,17898 cardinality,count(case when "ORD"."DATE_PLACED">=SYSDATE@!-7 AND "ORD"."DATE_PLACED"<=SYSDATE@! then 1 end) filtered_cardinality from DELPHIXDB.ORDERS ORD union all
                    select '    9' id,'DELPHIXDB' table_owner,'SUPPLIERS' table_name,'SUP2' table_alias,99 num_rows,count(*) rowcount,99 cardinality,cast(null as number) filtered_cardinality from DELPHIXDB.SUPPLIERS SUP2 union all
                    select '   12' id,'DELPHIXDB' table_owner,'ALTERNATIVES' table_name,'ALT' table_alias,17900 num_rows,count(*) rowcount,17900 cardinality,cast(null as number) filtered_cardinality from DELPHIXDB.ALTERNATIVES ALT union all
                    select '   19' id,'DELPHIXDB' table_owner,'ORDER_LINES' table_name,'ORL' table_alias,286416 num_rows,count(*) rowcount,286416 cardinality,cast(null as number) filtered_cardinality from DELPHIXDB.ORDER_LINES ORL union all
                    select null,null,null,null,null,null,null,null from dual
                  )
select frp_data.*,round(frp_data.filtered_cardinality/case when frp_data.rowcount = 0 then cast(null as number) else frp_data.rowcount end*100,1) actual_frp,decode(frp_data.filtered_cardinality,null,cast(null as number),round(frp_data.cardinality/case when frp_data.num_rows = 0 then cast(null as number) else frp_data.num_rows end*100,1)) plan_frp
from frp_data
where id is not null
order by id

If I spool this to a file like kmo.sql (Kevin Meade output) and run it, I’ll get the filter ratios

   ID OWNER     TABLE_NAME   ALIAS NUM_ROWS ROWCOUNT CARDINALITY FILTERED_CARDINALITY ACTUAL_FRP PLAN_FRP
----- --------- ------------ ----- -------- -------- ----------- -------------------- ---------- --------
    2 DELPHIXDB SUPPLIERS    SUP1        99       99          50                   49       49.5     50.5
    8 DELPHIXDB SUPPLIERS    SUP2        99       99          50                   50       50.5     50.5
   11 DELPHIXDB PRODUCTS     PRD2      1999     1999        1999
   12 DELPHIXDB ALTERNATIVES ALT      17900    17900       17900
   14 DELPHIXDB PRODUCTS     PRD1      1999     1999        1999
   17 DELPHIXDB CUSTOMERS    CUS      14576    14576          49                   49         .3       .3
   18 DELPHIXDB ORDERS       ORD      71604    71604       17898                 8941       12.5     25.0
   19 DELPHIXDB ORDER_LINES  ORL     286416   286416      286416

  • NUM_ROWS – row count from the table statistics
  • ROWCOUNT – actual count(*)
  • CARDINALITY – the optimizer’s expected cardinality after filtering
  • FILTERED_CARDINALITY – actual row count after applying the predicate filter
  • ACTUAL_FRP – actual filter ratio percentage
  • PLAN_FRP – expected filter ratio percentage
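The two ratio columns come straight from those counts. A quick arithmetic sketch using the CUS (customers) row from the output above:

```python
# Deriving ACTUAL_FRP and PLAN_FRP for the CUS row shown above:
# 14576 rows in the table, 49 of which survive the location = 'LONDON'
# predicate, and the optimizer also estimated 49 rows.
num_rows = 14576            # rows per table statistics
rowcount = 14576            # actual count(*)
cardinality = 49            # optimizer's expected rows after the filter
filtered_cardinality = 49   # actual rows after the filter

actual_frp = round(filtered_cardinality / rowcount * 100, 1)
plan_frp = round(cardinality / num_rows * 100, 1)
print(actual_frp, plan_frp)  # 0.3 0.3
```

When ACTUAL_FRP and PLAN_FRP agree, as here, the optimizer’s estimate for that filter is trustworthy; a large gap between them points at a cardinality estimation problem.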

From here I can see that the best filter ratio is on CUSTOMERS, so that’s the driving table where I should start my query.

Chapter 1 of Kevin Meade’s book is available online here.

A full set of scripts from his book is available here: scripts.rar

A short Word doc on tuning from Kevin, along with some examples, is available at information needed in Tuning a SQL Query.docx

And here is a thread on Orafaq from Kevin.


Performance #datachat

June 19th, 2015




photo by Tom Woodward

A summary of responses to questions from the SolarWinds #datachat back in the fall.

1. Why do performance issues happen?

Performance issues happen mainly because of bad design. Performance issues occasionally happen because of hardware undersizing (I/O, CPU, memory, network), but that’s the 20% (or less) of the 80/20 rule.
At IOUG last year someone tweeted out a slide that said 80% of performance problems were from I/O, and Kevin Closson tweeted “the other 80% are from bad SQL” 😛 . I thought that was pretty funny. In other words, bad SQL, which is part of bad design, is the cause of most performance problems, and not I/O per se.
Performance design problems arise primarily from unnecessary work. As I once heard Cary Millsap say, “The fastest query is the query that doesn’t have to be run.”

The #1 problem is the impedance mismatch between the object-oriented and functional programming paradigms on one side and the relational set-operation paradigm on the other.

My top 5 perf issues

1. Entity Attribute Value schema, aka the stuff table.

One of the worst performance designs is the Entity Attribute Value schema, aka the stuff table, which is ironically the design of many NoSQL implementations.
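To see why the stuff table hurts, here is a minimal sketch in Python/SQLite (the table and attribute names are invented): every attribute of an entity becomes its own row, so reconstructing even a single logical record requires a pivot, one CASE or join per attribute.

```python
import sqlite3

# A minimal EAV ("stuff table") sketch: one generic table holds every
# attribute as a separate row. Names and data are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stuff (entity_id INTEGER, attr TEXT, value TEXT)")
conn.executemany("INSERT INTO stuff VALUES (?, ?, ?)", [
    (1, "name", "widget"), (1, "color", "red"), (1, "weight", "10"),
])

# Pivot three attribute rows back into what should have been one row
# of a normal table -- one CASE expression per attribute.
row = conn.execute("""
    SELECT max(CASE WHEN attr = 'name'   THEN value END) AS name,
           max(CASE WHEN attr = 'color'  THEN value END) AS color,
           max(CASE WHEN attr = 'weight' THEN value END) AS weight
    FROM stuff WHERE entity_id = 1
""").fetchone()
print(row)  # ('widget', 'red', '10')
```

With a real column-per-attribute table this would be a single-row fetch; with EAV the optimizer also loses column-level statistics, so cardinality estimates suffer too.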

2. single row operations

Using databases without leveraging set operations leads to single row inserts followed by commits.
Single row operations are like going to the grocery store, buying one item, coming home, then going back for the next item.

When fetching data, it’s much more efficient to fetch multiple rows at a time through an array interface rather than a single row at a time.
“Refactoring SQL Applications” by Stephane Faroult is great for understanding and fixing bad database access code design.
The flip-side corollary is batching too much: collecting an hour of data before inserting, so that application data is up to an hour out of date.
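The grocery-store analogy is easy to demonstrate. A sketch in Python/SQLite (invented table and data; SQLite stands in for Oracle, and the absolute timings are illustrative, the structural difference is the point):

```python
import sqlite3
import time

# Single-row inserts with a commit per row vs. one batched executemany:
# the "one grocery item per trip" anti-pattern vs. a set operation.
rows = [(i, f"item {i}") for i in range(10_000)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, item TEXT)")

t0 = time.perf_counter()
for r in rows:                       # one statement + commit per row
    conn.execute("INSERT INTO orders VALUES (?, ?)", r)
    conn.commit()
single_row = time.perf_counter() - t0

conn.execute("DELETE FROM orders")
conn.commit()

t0 = time.perf_counter()
conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)  # one batch
conn.commit()                        # one commit for all 10,000 rows
batched = time.perf_counter() - t0

count = conn.execute("SELECT count(*) FROM orders").fetchone()[0]
print(f"single-row: {single_row:.3f}s  batched: {batched:.3f}s  rows: {count}")
```

Against a real networked database the gap is far larger, since each single-row statement also pays a network round trip and a log sync per commit.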

3. joining tables in code instead of SQL

One of the most disturbing performance issues is coders joining tables in application code instead of in the database.

4. using count(*) to define loop iterations or array sizes

Running count(*) just to find out how many loop iterations to make, or how big an array to allocate, instead of simply looping over a cursor until it is exhausted.
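A sketch of the count(*) anti-pattern and its fix, again in Python/SQLite with invented table and data:

```python
import sqlite3

# Sizing a loop with count(*) vs. just letting the cursor drive the loop.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_lines (id INTEGER, qty INTEGER)")
conn.executemany("INSERT INTO order_lines VALUES (?, ?)",
                 [(i, i % 5) for i in range(100)])

# Anti-pattern: an extra full pass over the table just to learn how
# many iterations the loop below will need.
n = conn.execute("SELECT count(*) FROM order_lines").fetchone()[0]

# Better: iterate the cursor until it is exhausted -- one pass, no
# count(*), and no stale answer if rows change between the two queries.
results = [qty for _, qty in conn.execute("SELECT id, qty FROM order_lines")]
print(n, len(results))
```

Besides doing the work twice, the count(*) can also be wrong by the time the second query runs, since other sessions may have inserted or deleted rows in between.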

5. flooding connection pools with new sessions when the application perceives a slowdown

When using a connection pool, increasing the number of sessions when performance degrades usually makes performance worse: the extra sessions add load to a system that is already saturated.

2. What is the number one performance tuning myth?

Probably the greatest performance myth is that there are silver bullets, like Exadata, that will fix your problem in place of good design.
As Cary Millsap once said, the most common problem is people looking for the most common problem instead of measuring for it.
Myth: it’s a beaten dead horse, but it still comes to mind: that having a good buffer cache hit ratio means I/O performance is good.
An even bigger myth than the BCHR is the myth that adding CPU, faster disks and more hardware will make performance scale.

3. What do you look for first when performance tuning?

According to the theory of constraints, “Any improvement not made at the constraint is an illusion.”
The first step in tuning is to find the constraint. To find the constraint you need to find where the most time is being spent

As Kellyn Pot’Vin said, “if you’re not tuning for time you are wasting time”
Graphical tools that visualize quantitative information are the most powerful way to find the constraints in Oracle
To find the constraint in Oracle, I use Average Active Session load in Top Activity Screen in OEM or a tool like Confio Ignite
To find the constraint without graphical tools, look at the Top 5 Timed Events in an AWR report and/or run ashrpt.sql

4. What 12c features do you see as having the greatest performance impact?

EM Express where developers can actually see the impact of their code
It is not fair when a DBA sees a performance problem in OEM top activity and then yells at a developer who is flying blind
EM Express can give developers safe visibility into performance impact and load their code puts on the system
Other things I’m looking forward to in 12c are Real Time ADDM, Compare period ADDM and ASH Analytics

5. How does virtualization impact performance tuning?

Virtualization massively helps performance at the level of a company
As Gene Kim (@RealGeneKim), author of “The Phoenix Project”, points out, provisioning environments is the biggest constraint in IT today
Highly recommend “The Phoenix Project” to DBAs and everyone in IT to understand finding the constraints and optimizing them
VMware, Puppet, Chef, etc. have alleviated enormous constraints on provisioning environments

The remaining major constraint in IT is provisioning copies of databases and data for environments
Delphix eliminates the enormous bureaucracy and resource constraints of provisioning database and data copies by virtualizing data

Performance of the company as a whole is orders of magnitude more important than the performance impact of virtualization on a database
Happy to say though that in my experience VMware overhead at a database performance level is usually negligible
On the other hand, since multiple VMs can run on the same shared hardware resource, other VMs can cause problems
Finding performance problems due to other VMs or the saturation of resources on ESX can be difficult
Confio’s tools that provide direct insight into VM performance correlated with Oracle performance are exciting


6. How often do poor configuration options lead to poor performance?

Configuration performance problems are pretty common but usually get ironed out quickly by competent DBAs
Typical configuration issues: redo logs too small, PGA too small, parallel query over-configured, SQL plans changing