Archive for the ‘Oracle’ Category

Full table scan runs way slower today!

May 7th, 2014

Every so often a DSS query that usually takes 10 minutes ends up taking over an hour.  (or one that takes an hour never seems to finish)

Why would this happen?

When investigating the DSS query, perhaps with wait event tracing,  one finds that the query which is doing full table scans and should be doing large multi-block reads and waiting for “db file scattered read” is instead waiting for single block reads, ie “db file sequential read”.  What the heck is going on?

Sequential reads during a  full table scan scattered read query is a classic sign of reading rollback and reading rollback can make that minute(s) full table scan take hours.

What can happen especially after over night jobs, is that if an overnight job fails to finished before the DSS query is run and if that overnight job  does massive updates without committing till then end, then the DSS query will have to rollback any changes made by the updates to the tables the DSS query is accessing.

How do we quickly identify if this our issue?

ASH is good at identify it. On the other hand it’s often impractical to whip up from scratch an ASH query and that’s where ashmasters on Github comes in. This ASH query and others are on Github under ashmasters.


For this case specifically see:

Here is the output (slight different format than in the github repository) of a query I used in my Oracle Performance classes

AAS SQL_ID           %  OBJ              TABLESPACE
----- -------------  ---  ---------------  ----------
  .18 0yas01u2p9ch4    6  ITEM_PRODUCT_IX  SOEINDEX
                       6  ORDER_ITEMS_UK   SOEINDEX
                      88  ITEM_ORDER_IX    SOEINDEX
  .32 6v6gm0fd1rgrz    6  MY_BIG_Table     SOEDATA
                      94  UNDO             UNDOTBS1

i.e. 95% of the second SQL_ID’s i/o was coming from UNDO. The reads will be single block reads and tremendously slow down the full table scans.

Time business concept.

Oracle, sql

Right Deep, Left Deep and Bushy Joins in SQL

May 5th, 2014

What is right deep verses left deep? Good question. In join trees (not VST) the object on the left is acted upon first then the object on the right.  Below are left deep and right deep examples of the same query, showing

  • query text
  • join tree
  • join tree  modified to more clearly show actions
  • VST showing the same actions

left_right_deep copy

All  of this boils down to the point that a right deep HJ can return rows earlier than a left deep HJ. A left deep HJ has to wait for each join to finished completely so that the result set can be hashed before  the next step can probe it. On the other hand, in a right deep HJ, it’s not the result sets that are being hashed, but a table at each level, thus each table can be hashed without waiting for intermediary results and once these hashes are complete  a probed row can flow all the way through the join tree,  from bottom to top, similar to how a nested loop can start giving results early.  The Left Deep HJs only have two open work areas at a time where as the Right Deep can have multiple work areas open.  One of the key thing to keep in mind is how much data we have to hash. If the intermediate result sets are large (and/or growing each level) then that represents more work each step of the way.

Normally I don’t think in left deep verses right deep because it doesn’t change the the path through the join tree. On the other hand it does change whether we are hashing a result set or we are hashing a table. If hashing a table then the results can be returned, in theory, more quickly.

For NL joins there are only left deep join. The object on the left always probes the object on the right.  The object on the left is always accessed first ( there is no need to modify the object on the left  first and probe from the right with NL).

 download (1)

Besides left deep and right deep, there are  also bushy joins which Oracle doesn’t do unless forces to through sub-queries or views as in the following:

download (2)

Finally,  the above VST diagrams were modified to be more easily compared to the join tree diagrams. Below are the VST diagrams as displayed by default in Embarcadero’s DB Optimizer.   DB Optimizer shows the tables all in a line because the tables are one to one relationships.  Keeping the VST diagram layout constant, it is easier to see the differences in execution paths:


Note on hinting and order of tables in explain plan:

For NESTED LOOPS and HASH JOINS the hint takes one argument, which is the second table that shows up in the explain plan.

For NESTED LOOPS the second table is the table probed into i.e. the second in order of access.

For HASH Joins the second table is the table doing the probing into the hash result set. Hashing is the first operation which creates the hash result set that the second table probes into.

For NESTED LOOPS  if order is “LEADING(X,Y) then  nested loops hint can only be on Yie USE_NL(Y)

For HASH JOINS if the order is “LEADING(X,Y) then the hash join hint can only be on Y, ie USE_HASH(Y)

Screen Shot 2014-05-06 at 6.51.02 PM

A couple of good references on USE_NL and USE_HASH

For more information see:

graphics, sql

Performance live discussion on twitter, 12pm PST Tue April 15

April 14th, 2014


 The tweets from the datachat are available at

in chronological order.


Confio software is hosting a live discussion on twitter tomorrow Tuesday April 15 at 12pm PST on the subject of Oracle performance.

I’ll be online answering performance questions and have invited many other friends to participate. Some friends who’ve said they’ll be there are

Participation and tracking of the discussion can accomplished by either posting with and  following along with the #datachat  hashtag.

Get on TweetDeck or your favorite Twitter tool, search #datachat, add a column and you’ll see our chat appear. Always include #datachat in your tweet so you’ll be part of the conversation.

For an example of a previous #datachat check out Confio’s hosting of  Pete Finnigan on  the subject of security.



photo by elod beregszaszi


Oracle SQL*Net Wait Events

March 24th, 2014



Unfortunately, what Oracle calls “Network Waits” most often have little to do with Network but and almost exclusively to do with the time it takes to pack messages for the network before they are sent.
Client = you, the tool, sqlplus, application
Not the client, the other side = the shadow process is communicating to the client

Of the three waits, only “more data” is possibly related to network issues and that’s not even clear, the other two are simply the time it takes to pack a message before sending it.

SQL*Net message to client – time to pack a message (no network time included) possibly tune SDU
SQL*Net more data from client – possible network issues, possibly tune SDU
SQL*Net more data to client – time to pack a message (no network time included) possibly tune SDU


 The same events exist, but where the client is the shadow process and another database plays the roll of shadow process:


SQL*Net message to dblink 
SQL*Net more data from dblink – possible network issues, possibly tune SDU
SQL*Net more data to dblink 


SQL*Net Wait Events


SQL*Net message from client

Idle Event
Waiting for work from Client
Includes network transmission times for messages coming from shadow

Typically indicative of Client “think time” or “processing time”
Example from Egor Starostin,
From a 10046 trace
   PARSING IN CURSOR #1 len=43 dep=0 uid=0 oct=3 lid=0 tim=1304096237
    hv=2707617103 ad=’89a03e18′
    select * from all_objects where rownum < 20
    PARSE #1:c=0,e=143,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1304096209
    EXEC #1:c=0,e=744,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1304097036
    WAIT #1: nam=’SQL*Net message to client’ ela= 3 driver id=1650815232
    #bytes=1 p3=0 obj#=-1 tim=1304097096
    FETCH #1:c=10000,e=6903,p=0,cr=9,cu=0,mis=0,r=1,dep=0,og=1,tim=1304104057
1->WAIT #1: nam=’SQL*Net message from client‘ ela= 721 driver
    id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1304104865        # [non-idle]
    WAIT #1: nam=’SQL*Net message to client’ ela= 1 driver id=1650815232
    #bytes=1 p3=0 obj#=-1 tim=1304105319
    FETCH #1:c=0,e=627,p=0,cr=21,cu=0,mis=0,r=15,dep=0,og=1,tim=1304105524
2->WAIT #1: nam=’SQL*Net message from client‘ ela= 253 driver
    id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1304105818        # [non-idle]
    WAIT #1: nam=’SQL*Net message to client’ ela= 1 driver id=1650815232
    #bytes=1 p3=0 obj#=-1 tim=1304105867
    FETCH #1:c=0,e=63,p=0,cr=6,cu=0,mis=0,r=3,dep=0,og=1,tim=1304105900
3->WAIT #1: nam=’SQL*Net message from client‘ ela= 1960753 driver
    id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1306066946 # [idle]
    PARSING IN CURSOR #1 len=21 dep=0 uid=0 oct=3 lid=0 tim=1306069444
    hv=2200891488 ad=’89913b50′
    select user from dual
    PARSE #1:c=0,e=60,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1306069440
The first two “SQL*Net message from client’ are in the middle of cursor processing and are considered non-idle waits.
The third “SQL*Net message from client” is between cursors and considered an idle event, ie we are waiting for the next command from the client.


SQL*Net message to client

Time it takes to pack a message to be sent to the client
Doesn’t include network timing
see Tanel Poder’s analysis of SQL*Net message to client


SQL*Net more data to client

Same as SQL*Net message to client except this is for data that spans SDU packets.

Wait represents the time it takes to pack data.
Doesn’t include network timing


SQL*Net more data from client

The only SQL*Net wait that can indicate a possible NETWORK problem
Client is sending data to shadow that spans packets (think large data inserts, possibly large code blocks, large SQL statements)
Shadow waits for next packet.
Can indicate network latency.
Can indicate a problem with the client tool
Here is an example with ASHMON where the application server died mid-stream on inserts. The shadow processes were left waiting for completion of the message. You can see the regular load on the database on the left, then just past the middle the load crashes, and all that’s left is waits on “SQL*Net more data from client”

Possibly set SDU=32768 as well as setting RECV_BUF_SIZE and SEND_BUF_SIZE to 65536.


SQL*Net break/reset to client

Error in sql statement

Control C
Usually highlights and error in application
            (CONSTRAINT T1_CHECK1 CHECK (C1 IN ('J','N')));
            '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12';
Trace File
       PARSING IN CURSOR #2 len=25 dep=0 uid=0 oct=2 lid=0 tim=5009300581224 hv=981683409 ad='8e6a7c10'
       END OF STMT
       PARSE #2:c=0,e=2770,p=0,cr=2,cu=0,mis=1,r=0,dep=0,og=1,tim=5009300581220
       BINDS #2:
       EXEC #2:c=0,e=128,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=5009300581418
       ERROR #2:err=1722 tim=512952379
       WAIT #2: nam='SQL*Net break/reset to client' ela= 31 driver id=1650815232 break?=1 p3=0 obj#=-1 tim=5009300581549
       WAIT #2: nam='SQL*Net break/reset to client' ela= 92 driver id=1650815232 break?=0 p3=0 obj#=-1 tim=5009300581662
Unfortunately Oracle doesn’t give much information about debugging unless you are trace. If you don’t trace, the SQL won’t be captured because from Oracle’s point of view the problem statement isn’t an acceptable SQL statement so there is no SQL ID to track down.



These waits are the same as
SQL*Net message to dblink

SQL*Net more data from dblink
SQL*Net more data to dblink
SQL*Net break/reset to dblink


Analysis and Tuning

There isn’t much to do on the Oracle side for tuning. You can try optimizing the SDU and SEND_BUF_SIZE and RECV_BUF_SIZE.
For actually getting information on network speeds you will have to use something like
  • ping
  • tnsping
  • network sniffer



The default SDU can be set in the sqlnet. ora
If it’s not set, the default is 2048
The max is 32768
The default,or the value in sqlnet.ora, can be overridden in the tnsnames. ora and the listener.ora. The client and server negotiate the size aggreeing on the smaller of the two settings.
(TDU – Transmission Data Unit – see note 44694.1 The TDU parameter has been deprecated in the Oracle Net v8.0 and beyond and is ignored. It is only mentioned here for backward compatibility.)
      V10G = (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = fuji)(PORT = 1522))
) )
       (SID_LIST =
       (SID_DESC =
       (SID_NAME = v10g)
       (ORACLE_HOME = /export/home/oracle10)



       trace_unique_client = true


       client_3582.trc:[12-JAN-2008 11:37:39:237] nsconneg: vsn=313, gbl=0xa01, sdu=32768, tdu=32767

more from Jonathan Lewis at



The recommended size for these buffers (from Oracle’s docs) is at least
Network bandwidth * roundtrip = buffer min size
For example if the network bandwidth is 100mbs and the round trip time (from ping) is 5ms then
           100,000,000 bits   1 byte   5 seconds
           ---------------- x ------ x --------- = 62,500 bytes
            1 second          8 bits     1000
           V10G = (DESCRIPTION =
           (ADDRESS = (PROTOCOL = TCP)(HOST = fuji)(PORT = 1522))
           (CONNECT_DATA =
           (SERVER = DEDICATED) (SERVICE_NAME = v10g)
           ) )
           SID_LIST_LISTENER =
           (SID_LIST =
           (SID_DESC =
           (SID_NAME = v10g)
           (ORACLE_HOME = /export/home/oracle10)


performance, wait events

Oracle : buffer busy wait

March 21st, 2014


Oracle 10 and 11

Buffer Busy Waits usually happen on Oracle 10 and 11 mainly because of insert contention into tables or Indexes.  There are a few other rare cases of contention on old style RBS segments, file headers blocks and freelists.
Before Oracle 10 and 11 there was one other major reason which was readers waiting for readers, ie one user does a phyiscal IO of a block into memory and a second user want to read that block. The second user waits until the IO is finished by the first user. Starting in 10g this wait has been given the name “read by other session“.  Before Oracle 10g this was also a “buffer busy wait”.
The easiest way to analyse the bottleneck and find a solution is to use ASH (active session History) available in Oracle 10g with the diagnostics pack license or using Simulated ASH for free or using a product like DB Optimizer.
Data block class, which can be found in ASH,  is the most important piece of information in analysing buffer busy waits. If we know the block class we can determine what kind of bottleneck:
    1. data block
      • IF OTYPE =
      • INDEX , then the insert index leaf block is probably hot, solutions are
        • Hash partition the index
        • Use reverse key index
      • TABLE, then insert block is hot,solutions
        • Use free lists
        • Put Object in ASSM tablespace
    2. Segment header – If “segment header” occurs at the same time as CLASS= “data block” on the same object and the object is of OTYPE= “TABLE”  then this is just a confirmation that the TABLE needs to use free lists or  ASSM.
    3. File Header Block – Most likely extent allocation problems, look at extent size on tablespace and increase the extent size to there are few extent allocations and less contention on the File Header Block.
    4. free lists – Add free list groups to the object
    5. undo header – Not enough UNDO segments, if using old RBS then switch to AUM
    6. undo block – Hot spot in UNDO, application issue
How do we find the block class? With a quick query on the ASH data like:

       o.object_name obj,
       o.object_type otype,
from v$active_session_history ash,
     ( select rownum class#, class from v$waitstat ) w,
      all_objects o
where event='buffer busy waits'
   and w.class#(+)=ash.p3
   and o.object_id (+)= ash.CURRENT_OBJ#
Order by sample_time;
For Example


------ ------ ------------- ------------------
TOTO1  TABLE  8gz51m9hg5yuf data block
TOTO1  TABLE  8gz51m9hg5yuf data block
TOTO1  TABLE  8gz51m9hg5yuf segment header
TOTO1  TABLE  8gz51m9hg5yuf data block
If we find that CLASS=datablock, then we will want more information to diagnose, such as the object type “OTYPE” , object name and what kind of tablespace the object is stored in. The following query provides that information:

set linesize 120

col block_type for a20
col objn for a25
col otype for a15
col filen for 9999
col blockn for 9999999
col obj for a20
col tbs for a10
       nvl(,to_char(bbw.p1)) TBS,
       tbs_defs.assm ASSM
from (
       count(*) cnt,
       nvl(object_name,CURRENT_OBJ#) obj,
       o.object_type otype,
       ash.SQL_ID sql_id,
       nvl(w.class,'usn '||to_char(ceil((ash.p3-18)/2))||' '||
                         0,'block')) block_type,
       --nvl(w.class,to_char(ash.p3)) block_type,
       ash.p1 p1
    from v$active_session_history ash,
        ( select rownum class#, class from v$waitstat ) w,
        all_objects o
    where event='buffer busy waits'
      and w.class#(+)=ash.p3
      and o.object_id (+)= ash.CURRENT_OBJ#
      and ash.session_state='WAITING'
      and ash.sample_time > sysdate - &minutes/(60*24)
      --and w.class# > 18
   group by o.object_name, ash.current_obj#, o.object_type,
         ash.sql_id, w.class, ash.p3, ash.p1
  ) bbw,
    (select   file_id, 
       tablespace_name name
  from dba_data_files
   ) tbs,
 tablespace_name    NAME,
        extent_management  LOCAL,
        allocation_type    EXTENTS,
        segment_space_management ASSM,
     from dba_tablespaces 
   ) tbs_defs
  where tbs.file_id(+) = bbw.p1
Order by bbw.cnt
and the output looks like

  CNT OBJ     OTYPE   SQL_ID        BLOCK_TYPE       TBS        ASSM

----- ------- ------- ------------- ---------------- ---------- ------
    3 TOTO1   TABLE   8gz51m9hg5yuf segment header   NO_ASSM    MANUAL
   59 TOTO1   TABLE   8gz51m9hg5yuf data block       NO_ASSM    MANUAL
Oracle 7, 8 and 9

Before Oracle 10, buffer busy waits also happened because IO blocking another user wanting to do the same IO. On Oracle 9, the main reasons for buffer busy waits are

1)       IO read contention (only Oracle 9i and below)

2)       Insert Block Contention on Tables or Indexes
3)       Rollback Segment Contention

On  7.0  – 8.1.5 see

On version 8 and 9, the p3 value has a different meaning. Instead  of meaning the block type (which is the best thing to know) it means the kind of buffer busy wait. There are only two values that matter to us, values in

100 range = read waits (basically just an IO wait)

Reader blocking Reader, ie one reader is reading a block in and another person wants to read this block and waits on a buffer busy wait p3=130.

200 range = write contetion (same as in 10g)

Writers blocking other writers for example while doing inserts either because of no free lists on the table or because everyone is inserting into the same index block.

If you have set up ASH style collection with S-ASH or have a product like DB Optimizer you can run a query like:


       count(*) cnt,
       o.object_name obj,
       o.object_type otype,
       decode(substr(ash.p3,1,1),1,'read',2,'write',p3) p3
from v$active_session_history ash,
      all_objects o
where event='buffer busy waits'
   and o.object_id (+)= ash.CURRENT_OBJ#
group by o.object_name, o.object_type, ash.sql_id, ash.p3,ash.CURRENT_OBJ#
order by cnt
And see what kind of buffer busy waits there are and what the objects are:


--- ------- ------- ------------ ---------- ------
  1                           -1 1375352856 read
  2                           -1  996767823 read
  2                           -1 2855119862 write
 17                           -1 1375352856 write
 89 TOTO1   TABLE         296030 1212617343 write
109                       296022 1212617343 write

Often the Current_obj# is -1 so we can’t figure out what the object is . There is an alternative method

col block_type for a18

col objn for a25
col otype for a15
col event for a15
col blockn for 999999
col segment_name for a20
col partition_name for a15
col owner for a15
set timing on
drop table myextents;
create table myextents as select * from dba_extents;
        decode(substr(ash.p3,1,1),1,'read',2,'write',p3) p3
from v$active_session_history ash,
     myextents ext
       event = 'buffer busy waits'
   and ( current_obj# = -1 or current_obj#=0  or current_obj# is null )
   --and sample_time > sysdate - &minutes/(60*24)
   --and session_state='WAITING'
   and  ext.file_id(+)=ash.p1 and
        ash.p2 between  ext.block_id and ext.block_id + ext.blocks
group by
Order by count(*)
Because querying DBA_EXTENTS  is a slow operation, I made a copy of DBA_EXTENTS which will be faster to query.


--- ------ -------------- --------------- ------------- --------
  1 SYS    _SYSSMU2$                      TYPE2 UNDO    read
  1 SYS    _SYSSMU3$                      TYPE2 UNDO    write
This second option of getting the object from P1 and P2 (file and block) should probably be done only with the users consent, because we would have to create a copy of the dba_extent table which might take a long time if it’s big.
No ASH ?

If you don’t have ASH data you will have to do some guess work.
Block Class (block type)

The first step in finding out the source of buffer busy waits is looking at

This will tell us what kind of datablocks we have contention on.
File with contention

You can also get an idea of what file contains the object with the buffer busy waits by looking at:

Object with contention

Starting in version 9i there is the table

That will list the objects with buffer busy waits.
If you are on version 7 or 8 good luck finding the object without setting up ASH style data collection.
Why do buffer busy waits happen?

To put it most succinctly, buffer busy waits happen because two users want to change a block at the same time. Two users can change the same block, or even same row “at the same time” ie without committing, but that’s different from the actual operation of modifying the block. The modification on the block in RAM, or computer memory, can only be done by one process at at time in order to avoid memory corruptions. Different users can modify different blocks at the same time but only one user or process can modify a the same block at a time.
In order to really understand what’s going on we have to take a look at how Oracle manages memory and block access and modifications.
Here is the layout of
Above is a diagram shows some of the essential parts of Oracle in regards to performance tuning.
In the machine memory are
  •     Oracle’s SGA, or System Global Area, a memory that is shared between Oracle users
  •     LGWR – log writer process
  •     DBWR – database writer process
  •     User1,2,3 … – user processes, in this case “shadow processes”

On the machine file system are

  • Redo log files
  • Data files
The SGA is composed of (among other things)
  • Log Buffer
  • Library Cache
  • Buffer Cache
What’s important for understanding buffer busy waits is how the buffer cache is managed. Here is view of the buffer cache with more components:
In order to access a block, a user (shadow process) has to get a latch (cache buffer chains latch) which protects buckets or linked lists of buffer headers. Once the header desired if found the latch is released. The buffer headers point to the actual data block in memory. Before modifying a block in memory a user has to lock the buffer header. The buffer header is locked any time a modification is made whether it is reading a block into memory or modifying a block that is already in memory. Usually the header is locked only for a brief amount of time but when there is a lot of concurrent access the buffer header can become a bottleneck.
BBW when readling data – read by other session

A buffer busy can happen on oracle 7,8 and 9 when one user is reading a block into memory and a second user wants to read that block. Instead of the second user trying to read that block into memory as well, they just wait for the first user to finish. Starting in Oracle 10, this kind of wait was renames “read by other session”

BBW on insert
If multiple concurrent users are inserting into a table that doesn’t have free lists or is not in an ASSM tablespace then all users will end up inserting into the same block, the first one on the free list and this block will become the hot block
by adding free lists or moving the table to an ASSM tablespace we will alleviate the bottleneck.
Multiple free lists:
The other option is ASSM or Automatic Segment Space Management which is set at the tablespace level.
In this case free block information is kept in Level 1 BMB (or bitmapped blocks). These Level 1 BMBs are chosen by a hash on the users process ID thus distributing the inserts across the table.
The inserts would look something like this (somewhat exaggerated drawing)
the ASSM BMB blocks take up more space in the table , about 1 extra block for every 16 data blocks and there is overhead first looking in the header/level 3 BMB block then going to the Level 2  then level 1 and finally to the datablock but all in all ASSM is worth reduced costs of management verses free lists.
Identifying and creating ASSM tablespaces
Which tablespaces are ASSM or not?


        extent_management  LOCAL,
        allocation_type    EXTENTS,
        segment_space_management ASSM,
from dba_tablespaces


--------------- ---------- --------- ------
USERS           LOCAL      SYSTEM    AUTO
DATA            LOCAL      SYSTEM    MANUAL
creating an ASSM tablespace:

create tablespace data2 

datafile '/d3/kyle/data2_01.dbf' 
size 200M
segment space management auto;

BBW on index (because of insert)

If users are inserting data that has a rising key value, especially a monotonically rising value, then all the new inserts will have to update the leading edge leaf block of the index and with high concurrent inserts this can cause buffer busy waits.
Hash partition the index
Reverse Key Index

BBW on old style RBS

IF block class > 18 it’s an old style RBS segment

Select  CURRENT_OBJ#||' '||o.object_name objn,

          o.object_type otype,
          CURRENT_FILE# filen,
          CURRENT_BLOCK# blockn,
          w.class ||' '||to_char(ash.p3) block_type
from v$active_session_history ash,
      (select rownum class#, class from v$waitstat ) w,
       all_objects o
where event='buffer busy waits'
    and w.class#(+)=ash.p3
    and o.object_id (+)= ash.CURRENT_OBJ#
Order by sample_time;    


----------- ------ ------ ------ ------------- ------------
54962 TOTO1 TABLE     16   45012 8gz51m9hg5yuf data block 
54962 TOTO1 TABLE     16     161 8gz51m9hg5yuf segment header
0                     14       9 8gz51m9hg5yuf  87
0                     14       9 8gz51m9hg5yuf  87

IF the block is of class > 18, the there will be no object name, so we have to look it up ourselves to be sure:

select  segment_name,  

from     dba_extents 
     &P2  between 
    block_id and block_id + blocks – 1
     file_id = &P1 ;

Plug in 14 for P1 the file # and 9 for P2 the block number:


-------------- --------------
R2             ROLLBACK
move to new AUM or Automatic Undo Mangement
alter system set undo_management=auto  scope=spfile;
BBW on a file header
The ASH data has two different fields that indicate the file # and block # when the wait is a buffer busy wait.
For a buffer busy wait
    File # = p1  *and* File # = current_file#
    Block # = P2  *and* Block # = current_block#
if  p1 != current_file#  or p2 != current_block# then use p1 and p2. They are more reliable.
for example


----- --- --- ---- ----- -- ------ -----------------
11:44 202   2 -1          0      0 file header block
11:44 202   2 TOTO TABLE  1  60218 file header block
11:44 202   2 TOTO TABLE  1  60218 file header block
11:44 202   2 TOTO TABLE  1  60218 file header block
11:44 202   2 TOTO TABLE  1  60218 file header block
The real file # is P1 =202 and block # is P2 which is 2
In my database I only had 10 files, so what is this file# 202?!
If you are getting buffer busy waits on the file header block for a tempfile (datafile in a temporary tablespace) then try increasing the “next extent” size in the temporary tablespace.
This wait can happen when lots of extents are being allocated in the temporary tablespace.
What Would ADDM do?
Interstingly enough the ADDM page doesn’t show the new load that has recently come on the system but the analysis is there.  I clicked on the next to bottom line in the page, “Read and write contention on database blocks was consuming significant database time.
Here are the outputs for the different scenarios.
 inserts into a table contention
 inserts into a table with contention on index
RBS contention
File Header Contention


performance, wait events

12c PDBs & Delphix: like chocolate & Peanut Butter

November 7th, 2013



Oracle 12c introduces the new Pluggable Database (PDB) functionality into the Oracle database. What’s the advantage of PDBs? PDBs eliminate the heavy memory overhead of starting up a full Oracle instance requiring a new SGA and full set of background processes. Instead,  we startup one container database (CDB) and then PDBs all share resources of the CDB. With a CDB & PDBs, there is one instance, one set of background processes, one SGA and these are shared among the PDBs. Starting a PDB only requires about 100MB  of memory as opposed to half a GB normally required to start an Oracle instance. Because of this reduced memory cost one can go from running 50  instances of Oracle on 20GB of memory to running 250 PDBs  in that same 20GB of memory. (numbers given by Oracle benchmarks)


One of the main use cases for PDBs is cloning a database for use in development and QA. With PDBs, one can easily give every developer in a development team their own copy of  a database, except for one thing. Each of those PDBs, even though they take up hardly any memory, still require a full set of datafiles. Creation of these datafiles is slow and costly. That’s where Delphix comes in.


With Delphix, having 10 copies of the same database takes up less than the size of the original database! How is that done? It’s done by compressing the original copy and then sharing all the duplicate blocks in that copy among all the clones.


Thus where PDBs give you database for free at the memory level,  Delphix gives you free databases at the disk footprint level and the combination of the two gives you practically free databases!




Oracle marketing slide showing 50 separate databases can run in 20 GB of RAM and 250 PDBs can run in 20 GB of RAM. “5x more scalable”




Screen Shot 2014-04-25 at 8.42.32 AM

Video example of how easy it is with Delphix to link to a PDB in one CDB and the provision it, thin clone, to another CDB on a different machine:

cloning, PDB

EM 12c Snap Clone

September 4th, 2013

Oracle OEM 12c introduces a new feature that enables the creation of Oracle database thin clones by leveraging file system snapshot technologies from either  ZFS  or Netapp.  The OEM adds a graphic interface to the process of making database thin clones. The feature that enables database thin cloning in OEM is called Snap Clone and is part of OEM’s Cloud Control Self Service for data cloning. Snap Clone is available via the feature Database as a Service (DBaaS). Snap clone leverages the copy on write technologies available in some storage systems for database cloning.  Support is initially available for NAS storage and specifically on  ZFS Storage  and NetApp Storage.

In order to use Snap Clone, one has to install the source database such that the source database data files are on a ZFS storage  or Netapp array and have the storage managed by agents on a LINUX machine and then one can thin clone data files on that same storage array.

Snap Clone offers role based access, so storage admin can login in and only have access to areas they are responsible for as well as limiting access to source databases, clones and resource by end users.

Setting Snap Clone


The prerequisites for getting start with Snap Clone are having available storage on ZFS Storage Appliance or Netapp storage array as well as having access to a master test database. A master test databse is a database that has a sanatized version of a production database such that it is either a subset and or  masked. The test master database has to be registered with OEM.  After the test master is registered with OEM, Snap Clone can be setup. To set up snap clone, come into Oracle Cloud Control 12c as “cloud administrator” role with “storage adminstator” priviledge or super administrator and register the storage. To register the storage navigate to “ setup -> provisining patching -> storage registration”.

  • Navigate to “ setup -> provisining patching -> storage registration”
  • Click “Register” tab, and choose storage, either Netapp or ZFS,
    • Supply storage information
      • Name: Storage array name registered in DNS
      • Vendor
      • Protocol: http or https
      • Storage Credentials: credentials for interacting with storage
  • Install agents on a separate LINUX machine to manage the Netapp or ZFS storage. An agent has to run on Linux host to manage the  storage. Supply the
    • Agent host
    • Host credentials
  • Pick a database to make the test master
    • Put the test master on ZFS storage or Netapp storage
    • Register the ZFS storage or Netapp storage with OEM
    • Enable Snap Clone for the  test master database
  • Set up a zone – set max CPU and Memory for a set of hosts and the roles that can see these zones
  • Set up a pool – a pool is a set of machines where databases can be provisioned
  • Set up a profile – a source database that can be used for thin cloning
  • Set up a service template – reference values such as a init.ora for database to be created

Screen Shot 2013-06-09 at 10.08.47 PM

Figure 1. Shows the entry page in OEM 12c Cloud Control. From here go to the top right and choose setup, then provisining patching then storage registration as shown above.

 Navigate to storage registration

To setup Snap Clone navigate to storage registyratiom choose the menus “setup -> provisining patching -> storage registration”.

Screen Shot 2013-06-09 at 10.10.44 PM

Figure 2 Shows a zoom into the menus to choose

Screen Shot 2013-05-31 at 10.09.02 PM

Figure 3. Storage Registration Page

Screen Shot 2013-06-09 at 10.15.04 PM

Figure 4. Choose the type of storage array

Once on the Storage Registration page, choose “Register” and then choose the storage, either Netapp or Sun ZFS.

Register the Storage

Screen Shot 2013-06-09 at 10.17.02 PM

Figure 5. Storage Registration Page

To register the storage supply the following information

  • Name: Storage array name registered in DNS
  • Vendor
  • Protocol: http or https
  • Storage Credentials: credentials for interacting with storage

All of which is documented in cloud administration guide.

Define agents used to manage storage

Then define agents used to manage storage. Agents have are required to run on a LINUX host.  More than one agen can define to provide redundancy.  The agents will be the path by which OEM communicates with the storage. For each agent, supply the following information

  • Host name
  • Credential type
  • Credentials

And finally define the frequency with which the agent synchronizes with the storage to gather the sorage hardware details such as  information on aggregates shares volumes.

After the storage information, agent information and agent synchronization information has been filled out, then hit “submit” button in the top right. Hitting the submit button will return the UI back to the “Storage Registration”. On the “Storage Registration”, click on the storage appliance listed in the top, then click on the contents tab on the bottom half of the page. This will list all the  volumns and aggregates in the storage appliance.

Looking at volumns on Storage Array

Screen Shot 2013-06-09 at 10.18.48 PM

Figure 6. Editing storage ceiling by clicking on a aggregate and hitting the “Edit Storage Config” tab.

For each aggregate one can set storage ceilings. Click on the aggregate or FlexVol and the click “Edit Storage Ceilings” tab.

Choosing a Database Test Master

On the database tab is a list of databases that can be used for cloning. OEM detects the database automatically on the hosts it is managing. OEM will also automatically correlate databases that have storage on the storage array added storage registration.  OEM looks for all databases that have files on the registered storage.  Click on database, then the show files tabs which will show the files and volumes for this database.

Screen Shot 2013-06-09 at 10.20.00 PM

Figure 7. List of files by volumn for database

Screen Shot 2013-06-09 at 10.20.07 PM

Figure 8. Enable Snap Clone for databases that will be used as test masters.

Nominating a database as test master requires enabling snap clone. To enable snap clone for a database, click on the chosen database, then click “Enable Snap Clone” tab just above the list of databases. This will automatically validate that all the volumes are flex clone enabled (in the case of Netapp).


Setting up Zones

The next step is to configure zone which can be used to organize cloud resources

Choose the menu option “Enterprise -> Cloud -> Midelware and Database Home”


Screen Shot 2013-06-09 at 10.23.11 PM

Figure 9. Navigate first to “Cloud -> Middleware and Database Home”

Middleware and Database Cloud page

Screen Shot 2013-06-09 at 10.24.23 PM

Figure 10.  Middleware and Databawe Cloud page

Setting up a Zone

In order to see the zones defined, click on the number next to the title “Paas Infrastructure Zones”  in the top left under General Information.

Screen Shot 2013-06-09 at 10.26.24 PM

Figure 11. PaaS Infrastructure Zones

To create a zone, click the tab “Create”.

Screen Shot 2013-06-09 at 10.27.33 PM

Figure 12. first page of wizard to create a PaaS Infrastructure Zone, give a meaningful name and description of the zone and define maximum CPU utilizaiton and memory allocation.

In the first page of the “PaaS Infrastructure Zone”, give zones a meaningful name and description. Define constraints such as maximum host CPU and memory allocations.

Screen Shot 2013-06-09 at 10.28.56 PM

Figure 13. Second page of the “PaaS Infrastructure Zone” wizard, add hosts that are available to the zone.

Next define hosts that are members of the zone and provide credentials that operate across all members of this zone

Screen Shot 2013-06-09 at 10.30.11 PM

Figure 14. Third page of the “PaaS Infrastructure Zone” wizard, limit which roles can see the zone.

Next define what roles can see and access this szone. The visibiliy of the zone can be limited to a certain class of users via roles like Dev, QA etc

Screen Shot 2013-06-09 at 10.31.18 PM

Figure 15. Final review page for “PaaS Infrastructure Zone” wizard

Finally review settings and click submit

Screen Shot 2013-06-09 at 10.32.29 PM

Figure 16. Showing the Confirmation that the PaaS Infranstructure Zone has been successfully created.


Creating Database Pool and Profiles

The remaining steps required to enable snap clone is to create a database pools which is a collection of servers or nodes that have database software installed.  The remaining part of the setup is done by a differnet user who is the administrator for database as a service.

Log in as  DBAAS_ADMIN.

For the next part navigate to the menu “Setup -> Cloud -> Database”.

Screen Shot 2013-06-09 at 10.33.24 PM

Figure 17. Middleware and Database Cloud page

Screen Shot 2013-06-09 at 10.35.59 PM

Figure 18. Navigate to “Setup -> Cloud -> Database”.

Screen Shot 2013-06-09 at 10.36.56 PM

Figure 19. Database Cloud Self Service Portal Setup. To create a database pool choose the “Create” button in the center of the page and from the pull down, choose “For Database”.

To create a new pool click on the “Create” button in the center of the page, and chose “For Database” from the pull down menu that appears.

Screen Shot 2013-06-09 at 10.38.16 PM

Figure 20. Choose “Create -> For Database”

Screen Shot 2013-06-09 at 10.39.10 PM

Figure 21. Edit pool page. Provide a meaningful name a descrpition of the pool. Add Oracle home directories in the bottom of the page. At the very bottom of the page set a constraint on the number of databases instances that can be created in the pool. On the top right, set the host credentials.


  • Name and description
  • Oracle Home
  • Maximum number of databases per host
  • Credentials

In the “Edit Pool” page, at the top left of the screen, provide a meaningful name and description for the pool. In the middle of the screen add Oracle homes that will be used for databse provisioning.  Every member of a database pool is required to be homogeneous. Homogenous requires that the platform and Oracle version is the same across all the hosts  and Oracle homes in the pool. All the Oracle installations also have to be of the same type either single instance or RAC. In the top right  add the  Oracle home provide oracle credentials and root credentials. Finally at the bottom of the page a constraint can be set on the number of database instances that can be started in this pool.

Screen Shot 2013-06-09 at 10.41.03 PM

Figure 22. Set request limits on the pool

The next page sets the request settings. The first restriction sets how far in advanced can requrest can be made. Second restricts  how long a request can be kept which is the archive retension.  After the archive retention time  the  requests will be deleted.  Finally is the request duration which is the maximum duration for which the request can be made.

Screen Shot 2013-06-09 at 10.42.12 PM

Figure 23. Set memory  and storage quotas per role for the pool. The quotas cover memory, storage, database requests and schema requests.


The above page  configures quotas.  Quota is allocated to each and every self service user. The quotas controls the amount fo resources users  have access to. Quotas are assigned to a role and users inherit quota values from the role. Click “Create” in the middle of the screen.

Screen Shot 2013-06-09 at 10.43.03 PM

Figure 24. Editing the quotas on a pool for a  role.

The popup dialogue has for these entries

  • Role name
  • memory GB
  • storage GB
  • number of database request

Screen Shot 2013-06-09 at 10.43.56 PM

Figure 25. Profiles and Service Templates

Profiles and service templates

.A profile is use to capture information about the source database which can then be used for provisioning.

A service template is a standardized service definition for a database configuration that is offered to the self service users. A collection of service templates forms the  service catalogue. A service template will provision databsae with or  without seed data. To capture an ideal configuration, the easist thing to do is to point at an existing database and fetch information of interest from that database.  The information from the database can be captured using a profile.

To create a profile click on the “Create” button under “Profiles”

Creating a profile

Screen Shot 2013-06-09 at 10.45.07 PM

Figure 26. Specify a reference target for a profile.

Click the magnifying glass to search for a source database.

Screen Shot 2013-06-09 at 10.46.46 PM

Figure 27. Search for a reference target database


Pick a refrence target by clicking on it, the click the “Select” button in bottom right.

Screen Shot 2013-06-09 at 10.47.31 PM

Figure 28. Creating a Database Provisioning Profile

To pick a database for use in thin cloning, choose the check box “Data Content” with suboption selected fro “Structured Data” with sub-option selected for “Create” with sub-option selected for “Storage Snapshot”. This  option is only enabled only when the “enable snapshot” option is enabled on the storage registration page. Disable option capture oracle home.

Provide credentials for the host machines Oracle account and for the database login.


The “Content Option” step is not needed for the above selections.

Screen Shot 2013-06-09 at 10.48.16 PM

Figure 29. Give the profile a meaniful name and description


Next provide credentials for Oracle home and Oracle databse, then provide a meaningful name for the profile  as well as a location. The profile will be userful when creating a service template.

Screen Shot 2013-06-09 at 10.49.12 PM

Figure 30. Review of create profile options.


Next  review the summary and click subit which will connect to storage and take snapshots of the storage

Screen Shot 2013-06-09 at 10.50.46 PM

Figure 31. Shows a zoom into the menus to choose


To  create a new service template choose a profile and  in this case use “thin provisioning for reference DB” profile.  Now to create a new service template click “create” and choose “for database”. Service templates are part of the service catalogue and exposed to the self service users.

Screen Shot 2013-06-09 at 10.51.42 PM

Figure 32. Provide a meaningful name and description for the service template.


Provide a meaningful name and description. For the rest of service template provide information about the databses that will be created from the snapshots such as providing database type, rac or single instance, for rac provide number of nodes. Provide the SID prefix to appended to the SIDs generated for the clones, provide the Domain Name and the port.

Screen Shot 2013-06-09 at 10.52.38 PM

Figure 33. Provide a storage area for writes to use.


The   cloning operation only creates a read only copy thus it is required to provide write space elsewhere in order to allow writing to the thin clone.

click on the edit button

click on volumne, then edit button

Screen Shot 2013-06-09 at 10.53.29 PM

Figure 34. set diretory and maximum space usage for the write location


Provide the mount point prefix and amount of writeable space wish to allocate


Users of the thin clone databses can  also be allowed to take further snapshots. These snapshots can be used as a mechinism to rollback changes, The number of thiese snapshtos can limited just below storage size section:

Screen Shot 2013-06-09 at 10.54.18 PM

Figure 35. set the number of snapshots that can be taken of a a thin clone

Screen Shot 2013-06-09 at 10.55.15 PM

Figure 36. set the initial passwords for database accounts on the thin clone.

next provide credentials for administrative accounts

  • SYS

for all other non-administartive accounts can choose to leave them as is or change them all to one password you can modify certain init.ora parameters for exmaple memory

Screen Shot 2013-06-09 at 10.55.59 PM

Figure 37. Modify any specific init.ora parameters for the thin clone


Screen Shot 2013-06-09 at 10.57.13 PM

Figure 38. Set any pre and post provision scripts to be run at the creation of a thin clone.


custom scripts can be provide as pre or post creation steps this can be very useful if you want to register databses with  OID or certian actions that are specific to your organization

Screen Shot 2013-06-09 at 10.58.05 PM

Figure 39. Set the zone and pool for the thin clone

Screen Shot 2013-06-09 at 10.59.07 PM

Figure 40. set the roles that can use the service template.

you associate this srvice template with a zone and a template this insures that the service template can actualy work on the  group of resources that you have identified and can limit the visibility of the service tempalte usein roles

Screen Shot 2013-06-09 at 11.00.17 PM

Figure 41. review of the service template creation requites

finally we review the summary and click submit

Creating  a Thin Clone

Screen Shot 2013-06-09 at 11.06.14 PM

Figure 42. 12c Cloud Control

Screen Shot 2013-06-09 at 11.07.06 PM

Figure 43. 12c Cloud Control Self Service Portal

Contents of the Database Cloud Self Service Potal screen

  • Left hand side
    •  Notification – any instances that are about to expire
    •   Usage
      •     databases (number provisioned out of maximum)
      •     schema services
      •     Memory
      •     Storage
  • Right side
    •   Top
      •     Databases
    • Bottom
      •     requests – requests that created the database services and the database instances

Screen Shot 2013-06-09 at 11.08.02 PM

Figure 44. To clone a database, choose the “Request” then “Database” menu.

Screen Shot 2013-06-09 at 11.08.40 PM

Figure 45. From the list choose a Self Service Template. In this case “SOEDB Service Template”

Options are

  •   RMAN backups which are full clones
  •   empty databases
  •   snap clone which are thin clones

Screen Shot 2013-06-09 at 11.09.42 PM

Figure 46. Fill out the clone Service Request

request wizard asks for

  • request name
  • select zone – collection of servers
  • select a start and end time
  • provide a user name and password, new user and password

Users do not get system access to the databases but instead get a slightly less privilege user who becomes the owner of the database

Hit Submit

Screen Shot 2013-06-09 at 11.10.25 PM

Figure 47. Shows new clone database


References – setup – provisioning!


cloning, em, Oracle , ,

Metrics vs Statistics

September 4th, 2013

Here are  the tuning metrics tables (SQL  stats are not in “metric” tables per say)

(*DBA_HIST_…_HISTORY views are sort of confusing. AFAI remember they were storing alert history, but apparently they are used for adaptive thresholds – an area for future investigation)


I’ve noticed a number of people posting queries using DBA_HIST_SYSSTAT instead of DBA_HIST_SYSMETRIC_SUMMARY which leads me to believe that there is some confusion or lack of information on the metric tables.

Oracle 10g introduced metric tables which compute deltas and rates of statistics thus hugely simplifying the ability to answer simple questions like “what is the I/O rate on my databases right now.” This question, before 10g, was surprisingly tedious to answer. To answer the question one would have to query v$sysstat  for example:

Select value from v$sysstat where name=’physical reads’;

but querying v$sysstat just once fails to answer the question but instead answers the question “How much I/O has been done since the database was started”. To answer the original question one would have to query v$sysstat twice and take the delta between the two values:

  • Take value at time A
  • Take value at time B
  • Delta = (B-A)
  • and/or get Rate = (B-A)/elapsed time

Getting these deltas and rates could be a pesky task especially working with a customer over the phone. Then 10g Oracle introduced metric tables which answer the questions in one single query using


such as

from v$sysmetric
where metric_name='Physical Reads Per Sec';
---------- ----------------- ------------
654.6736 Reads Per Second          5959
134.9835 Reads Per Second          1515

Notice that the query returns 2 rows. The first row is the the last minute (ie 59.59 seconds) and the second row is the last 15 seconds (ie 15.15 seconds). Oracle collects both the deltas and rates for 60 second and 15 second intervals.

Oracle has the average, maximum, minimum for the values for the last hour in


that one can query like:

where metric_name='Physical Reads Per Sec';

---------- ---------- ---------- ------------------
3.71784232          0 .076930034         .478529283

Also for the last hour Oracle stores the 60 second intervals and for the last 3 minutes the 15 second intervals in



Then for the last week by default, Oracle saves the values for each hour including the maximum, minimum, average, stddev etc in





One issue with using

  • V$SYSMETRIC – last 15 and 60 seconds
  • V$SYSMETRIC_SUMMARY – values  last hour (last snapshot)  like avg, max, min etc
  • V$SYSMETRIC_HISTORY – last hour for 1 minute, last 3 mintes for 15 second deltas
  • DBA_HIST_SYSMETRIC_SUMMARY – hour summaries for last week.

is becoming familiar with the statistics names which are different from v$sysstat. We can look at


For the group_names (statistic definitions)

For trending data over multiple days, the view DBA_HIST_SYSMETRIC_SUMMARY can be used.  The view has a history of all theSystem Metrics Long Duration statistics.  If you store multiple databases in the same AWR repository you can check the statistics available to a particular DBID with the view DBA_HIST_METRIC_NAME.



Easy query


The view DBA_HIST_SYSMETRIC_SUMMARY  can be queried easily for trending metrics, such as the simple query for bytes read by the database per second:

select   dbid, to_char( begin_time ,'YYYY/MM/DD HH24:MI'),
from     dba_hist_sysmetric_summary
where    metric_name= 'Physical Read Total Bytes Per Sec' /* and DBID=[dbid if share repository] */
order by begin_time;

Tougher Query


Compare the above query  to the same query on DBA_HIST_SYSSTAT (note there are a lot of stats in v$sysstat)

with stats as (
           select sn.dbid,
                  to_char(cast(begin_interval_time as date ), 'YYYY/MM/DD HH24:MI') btime,
                  -- Lag (st.value) OVER( PARTITION BY st.stat_name,st.dbid ORDER BY sn.snap_id)
                  Lag (st.value) OVER( PARTITION BY st.stat_name,st.dbid,st.startup_time ORDER BY sn.snap_id)
                  st.value     value_end,
                  (cast(end_interval_time as date) - cast(begin_interval_time as date )) * (24*60*60) delta
                  DBA_HIST_SYSSTAT  st,
                  DBA_HIST_SNAPSHOT sn
                    sn.snap_id=st.snap_id and
                    sn.dbid=st.dbid and
                    (st.stat_name= 'physical read total bytes')
           order by begin_interval_time
          dbid, btime,
          round((value_end-value_beg)/delta) rate_per_sec
   from stats
   where (value_end-value_beg) > 0

Its a bit disconcerting to note that the above two queries don’t return the exact same data on my laptop. If it was roughly the same that would be fine, and in general the stats are the similar but there are cases where they differ dramatically.  I don’t see anything obvious in the way the queries are written.  Possibly has to do with database bounces or the way the database is affected by the laptop’s sleep and hibernate modes. Will have to look into this farther.

One trick to make the data easy to load into Excel is to use the html output format and spool to a file with an “.html” extension

SET markup HTML on
spool output.html


Other info


List of DBA_HIST views


Power Struggle Between a Man and a Woman

Oracle, performance , , ,

Wait Metrics vs Wait Events

September 3rd, 2013

Here is a quick table comparison of  different types of metrics views


The first line of the table is the classic wait event and statistic views. The following lines are the metric views.  The metric views were introduced in Oracle 10g.

Why Metrics are good

Metric views compute deltas and rates  which hugely simplifying the ability to answer simple questions like “what is the I/O rate on my databases right now?” This question, before 10g, was surprisingly tedious to answer. To answer the question one would have to query v$sysstat  for example:

Select value from v$sysstat where name=’physical reads’;

but querying v$sysstat just once fails to answer the question but instead answers the question “How much I/O has been done since the database was started?” To answer the original question one would have to query v$sysstat twice and take the delta between the two values:

  • Take value at time A
  • Take value at time B
  • Delta = (B-A)
  • and/or get Rate = (B-A)/elapsed time

Getting these deltas and rates could be a pesky task especially working with a customer over the phone. Then 10g Oracle introduced metric tables which answer the questions in one single query .

Using Metrics with Waits

Time business concept.

The metric views  apply to wait events as well as  statistics. In a future posting we will go over statistics. In this posting we will go over wait events.  The number of views available to analyze wait events can be confusing. The point of this post is to clarify what the different views available are and how they can be used.

The wait event views are   (at system level)

  • V$SYSTEM_EVENT – wait events cumulative since startup
  • V$EVENTMETRIC – wait event deltas last 60 seconds
  • DBA_HIST_SYSTEM_EVENT – wait events by snapshot (hour) for last week, cumulative since startup

The wait events are rolled up in to groups called wait classes. For wait class we have the following views:

  • V$SYSTEM_WAIT_CLASS – cumulative since start up
  • V$WAITCLASSMETRIC – last 60 seconds deltas
  • V$WAITCLASSMETRIC_HISTORY – 60 seconds deltas for last hour

Note: DBA_HIST_WAITCLASSMETRIC_HISTORY is used for alerts and or baselines not everyday values.

Use Wait Event Metrics for Latency

I use wait event metrics for I/O latencies.

It may be surprising that I don’t mention using waits to identify bottlenecks and load on the system. For bottlenecks and load on the system the data in V$ACTIVE_SESSION_HISTORY (ASH) is probably better for a few reasons. One the data in ASH is mult-dimesional so it can be grouped by SQL and Session Also CPU information is derivable from ASH. CPU information is not in the event/waitclass views but is in ASH along with the waits.

The second part, the  latencies, specifically I/O latencies,  are only available in the wait event and waitclass views (and the filestat views on a per file basis)

User I/O latency with WAIT CLASS

One  use  of wait  metrics is determining the average read I/O  for all the various kinds of read I/O and read sizes:

select 10*time_waited/nullif(wait_count,0) avg_io_ms -- convert centi-seconds to milliseconds
from   v$waitclassmetric  m
       where wait_class_id= 1740759767 --  User I/O

One issue with V$WAITCLASSMETRIC is that the field WAIT_CLASS name is not in the view, so we either have to use the WAIT_CLASS_ID (the hash of the name)  as above or join to V$SYSTEM_WAIT_CLASS as below

        10*m.time_waited/nullif(m.wait_count,0) avgms -- convert centisecs to ms
from   v$waitclassmetric  m,
       v$system_wait_class n
where m.wait_class_id=n.wait_class_id
  and n.wait_class='User I/O'

Another issue is that the documentation for 11gR2 says that the TIME_WAITED is microseconds but in my tests it’s actually centisecs

Name                                        Type
-----------------------------------------  ----------------------------
WAIT_CLASS_ID                                NUMBER
WAIT_CLASS#                                  NUMBER
WAIT_CLASS                                   VARCHAR2(64)
TOTAL_WAITS                                  NUMBER
TIME_WAITED                                  NUMBER  - centi-seconds

You can get a list of all the WAIT_CLASS names in the view V$SYSTEM_WAIT_CLASS.  

 select wait_class_id , wait_class from V$SYSTEM_WAIT_CLASS ;

------------- ----------------------------------------------------------------
   1893977003 Other
   4217450380 Application
   3290255840 Configuration
   4166625743 Administrative
   3875070507 Concurrency
   3386400367 Commit
   2723168908 Idle
   2000153315 Network
   1740759767 User I/O
   4108307767 System I/O

Latencies for specific I/O  Wait Events

For specific I/O latencies there are two choices – v$eventmetric and v$system_event. With v$system_event it requires  running multiple queries and taking the deltas but the deltas are are already calculated in v$eventmetric. Here is an example of getting I/O latencies from v$eventmetric

Latencies in the past minute

col name for a25
select m.intsize_csec, ,
       round(m.time_waited,3) time_waited,
       round(10*m.time_waited/nullif(m.wait_count,0),3) avgms
from v$eventmetric m,
     v$event_name n
where m.event_id=n.event_id
  and in (
                  'db file sequential read',
                  'db file scattered read',
                  'direct path read',
                  'direct path read temp',
                  'direct path write',
                  'direct path write temp',
                  'log file sync',
                  'log file parallel write'
------------ ------------------------- ----------- ---------- ----------
        6017 log file parallel write         2.538          4      6.345
        6017 log file sync                   2.329          1     23.287
        6017 db file sequential read             0          0
        6017 db file scattered read              0          0
        6017 direct path read                    0          0
        6017 direct path read temp               0          0
        6017 direct path write                   0          0
        6017 direct path write temp              0          0

Latencies averaged over each hour for specific I/O Wait Events

       round((time_ms_end-time_ms_beg)/nullif(count_end-count_beg,0),3) avg_ms
from (
       to_char(s.BEGIN_INTERVAL_TIME,'DD-MON-YY HH24:MI')  btime,
       total_waits count_end,
       time_waited_micro/1000 time_ms_end,
       Lag (e.time_waited_micro/1000)
              OVER( PARTITION BY e.event_name ORDER BY s.snap_id) time_ms_beg,
       Lag (e.total_waits)
              OVER( PARTITION BY e.event_name ORDER BY s.snap_id) count_beg
   and e.event_name like '%&1%'
order by begin_interval_time
order by btime
BTIME               AVG_MS
--------------- ----------
20-JUL-11 06:00      5.854
20-JUL-11 07:00      4.116
20-JUL-11 08:00     21.158
20-JUL-11 09:02      5.591
20-JUL-11 10:00      4.116
20-JUL-11 11:00      6.248
20-JUL-11 12:00     23.634
20-JUL-11 13:00     22.529
20-JUL-11 14:00      21.62
20-JUL-11 15:00     18.038
20-JUL-11 16:00     23.127

What are the sizes of the I/O requests?

One issue with looking at I/O latencies is determining the I/O sizes.  It would be awesome if there was a view with I/O counts, sizes and latencies in one place. ASH does have this information  but ASH data is weighted to the longer latencies and sizes and not the average. The average sizes have to be gotten from system statistics. The I/O sizes for ‘db file sequential read’ are single block reads so are single value that can be determined , but the other read events can vary in size. To get a general idea of I/O sizes one could just average across all I/O using the system statistics

Average I/O Size (across all I/O waits)

          sum(decode(metric_name,'Physical Reads Per Sec',value,0))*max(intsize_csec)/100  blocks_read,
          nullif(sum(decode(metric_name,'Physical Read IO Requests Per Sec',value,0)),0)*max(intsize_csec)/100  reads,
            sum(decode(metric_name,'Physical Reads Per Sec',value,0))/
          nullif(sum(decode(metric_name,'Physical Read IO Requests Per Sec',value,0)),0) avg_blocks_read
from v$sysmetric
where group_id = 2  -- 60 second deltas only (not the 15 second deltas);

----------- ---------- ---------------
       4798       4798               1


Load and Bottlenecks

The good thing about wait classes is that they simplify dealing with 1000s of wait events and group them into just a few wait classes. We can get a quick view of load on the system with

select n.wait_class, round(m.time_waited/m.INTSIZE_CSEC,3) AAS
from   v$waitclassmetric  m,
       v$system_wait_class n
where m.wait_class_id=n.wait_class_id
and n.wait_class != 'Idle'
WAIT_CLASS             AAS
--------------- ----------
Other                    0
Application              0
Configuration            0
Administrative           0
Concurrency              0
Commit                   0
Network                  0
User I/O              .149
System I/O            .002

but the big drawback with wait event and/or wait class views is that they lack information on CPU load. CPU load can be found in the system statistics but it’s just easier to do it all in one query using v$active_session_history. Here is a query using ASH to calculate AAS load on the database over the last 60 seconds:

            round(count(*)/secs.var,3)     AAS,
            decode(session_state,'ON CPU','CPU',wait_class)  wait_class
       from v$active_session_history ash,
            (select 60 var from dual)  secs
            SAMPLE_TIME > sysdate - (secs.var/(24*60*60)) and
       group by decode(session_state,'ON CPU','CPU',wait_class) , secs.var
---------- ---------------
      .016 Concurrency
      .001 Network
         0 Other
      .083 Configuration
      .001 Administrative
      .034 CPU
         0 System I/O
      .001 Commit
      .054 Application
         0 User I/O

Though with v$sysmetric it’s pretty easy to join to v$waitclassmetric

  select n.wait_class, round(m.time_waited/m.INTSIZE_CSEC,3) AAS
   from   v$waitclassmetric  m,
          v$system_wait_class n
   where m.wait_class_id=n.wait_class_id
   and n.wait_class != 'Idle'
   select  'CPU', round(value/100,3) AAS from v$sysmetric where metric_name='CPU Usage Per Sec' and group_id=2 ;
WAIT_CLASS                                                              AAS
---------------------------------------------------------------- ----------
Administrative                                                            0
Application                                                            .009
CPU                                                                   1.696
Commit                                                                    0
Concurrency                                                            .001
Configuration                                                             0
Network                                                                .002
Other                                                                     0
System I/O                                                                0
User I/O                                                                  0

and adding v$sysmetric into the query allows me to do something I’ve always wanted which is to include the OS CPU in AAS

 select n.wait_class, round(m.time_waited/m.INTSIZE_CSEC,3) AAS
   from   v$waitclassmetric  m,
          v$system_wait_class n
   where m.wait_class_id=n.wait_class_id
   and n.wait_class != 'Idle'
   select  'CPU', round(value/100,3) AAS from v$sysmetric where metric_name='CPU Usage Per Sec' and group_id=2
   select 'CPU_OS', round((prcnt.busy*parameter.cpu_count)/100,3) - aas.cpu
  ( select value busy from v$sysmetric where metric_name='Host CPU Utilization (%)' and group_id=2 ) prcnt,
  ( select value cpu_count from v$parameter where name='cpu_count' )  parameter,
  ( select  'CPU', round(value/100,3) cpu from v$sysmetric where metric_name='CPU Usage Per Sec' and group_id=2) aas
WAIT_CLASS                                                              AAS
---------------------------------------------------------------- ----------
Administrative                                                            0
Application                                                               0
CPU                                                                    .009
CPU_OS                                                                 .024
Commit                                                                    0
Concurrency                                                               0
Configuration                                                             0
Network                                                                .002
Other                                                                     0
System I/O                                                                0
User I/O                                                                  0

One huge loss over using ASH is the loss of the information users waiting for CPU but not running on CPU.

For further reading see

  • – shell script to display I/O latency from v$system_event
  • Oracle CPU time – how to see Oracle’s usage and demand of CPU




Oracle, performance, wait events , ,

Enqueue – PK, FK or Bitmap Index problem?

August 30th, 2013

MP900302987If one is seeing waits for enq: TX – row lock contention  then there could be a lot of reasons. One distinguishing factor is the lock mode. If the lock mode is exclusive (mode 6) then it’s most likely a classic row lock where two sessions are trying to modify the same row. On the other hand if the lock mode is share (mode 4)  it’s typically going to be

  • Primary/Unique Key: inserting a unique key when  someone else has already inserted that key but not committed
  • Foreign Key: Inserting a foreign when then parent value has been inserted but not committed or deleted and not commited (not to be confused with locks due to un-indexed foreign key which cause a “enq: TM – contention”  wait not a TX wait)
  • Bitmap Index: bitmap index chunk contention

Now how to tell which of these is happening? Well here is a query on ASH (I’ve commented out some of the useful fields to limit the output)  that will generate output to help us distinguish between the 3 different cases

col object for A15
col otype for A10
       substr(event,0,20)    lock_name,
       --ash.session_id        waiter,
       --mod(ash.p1,16)        lmode,
       --ash.p2                p2,
       --ash.p3                p3,
       o.object_name           object,
       o.object_type           otype,
       CURRENT_FILE#           filen,
       CURRENT_BLOCK#          blockn,
       --ash.SQL_ID            waiting_sql,
       BLOCKING_SESSION        blocker
         v$active_session_history ash,
         all_objects o
           event like 'enq: TX%'
   and   mod(ash.p1,16)=4
   and o.object_id (+)= ash.CURRENT_OBJ#

Here is the output from the 3 different cases

Unique Index

  ------  ------ ----- ------ -------
      -1             0      0     158
      -1             0      0     158
      -1             0      0     158
      -1             0      0     158

Foreign Key

  ------  ------ ----- ------ -------
   CHILD   TABLE     1  60954       1
   CHILD   TABLE     1  60954       1
   CHILD   TABLE     1  60954       1

Bitmap Index

  ------  ------ ----- ------ -------
      I1   INDEX     0      0     144
      I1   INDEX     0      0     144
      I1   INDEX     0      0     144
      I1   INDEX     0      0     144

Each case has a different footprint.

  • Unique key index issue object of “-1″
  • Foreign key case has a blocker of “1″
  • Bitmap index case as filen and blockn “0″

These cases were run on thus the “footprint” could change on other versions.

The above ASH query and many other useful ASH queries are maintained on GitHub at


Oracle, performance, wait events , ,