Archive for March, 2014

Oracle SQL*Net Wait Events

March 24th, 2014



Unfortunately, what Oracle calls “Network Waits” most often have little to do with the network and almost everything to do with the time it takes to pack messages for the network before they are sent.
Client = you: the tool, sqlplus, the application
The other side = the shadow process that is communicating with the client

Of the three waits, only “more data from client” is possibly related to network issues, and even that isn’t certain; the other two are simply the time it takes to pack a message before sending it.

SQL*Net message to client – time to pack a message (no network time included) possibly tune SDU
SQL*Net more data from client – possible network issues, possibly tune SDU
SQL*Net more data to client – time to pack a message (no network time included) possibly tune SDU


The same events exist for database links, where the local shadow process is the client and the remote database plays the role of the server:


SQL*Net message to dblink 
SQL*Net more data from dblink – possible network issues, possibly tune SDU
SQL*Net more data to dblink 


SQL*Net Wait Events


SQL*Net message from client

Idle Event
Waiting for work from Client
Includes the network transmission time for the message coming from the client

Typically indicative of Client “think time” or “processing time”
Example from Egor Starostin,
From a 10046 trace
   PARSING IN CURSOR #1 len=43 dep=0 uid=0 oct=3 lid=0 tim=1304096237
    hv=2707617103 ad='89a03e18'
    select * from all_objects where rownum < 20
    PARSE #1:c=0,e=143,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1304096209
    EXEC #1:c=0,e=744,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1304097036
    WAIT #1: nam='SQL*Net message to client' ela= 3 driver id=1650815232
    #bytes=1 p3=0 obj#=-1 tim=1304097096
    FETCH #1:c=10000,e=6903,p=0,cr=9,cu=0,mis=0,r=1,dep=0,og=1,tim=1304104057
1->WAIT #1: nam='SQL*Net message from client' ela= 721 driver
    id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1304104865        # [non-idle]
    WAIT #1: nam='SQL*Net message to client' ela= 1 driver id=1650815232
    #bytes=1 p3=0 obj#=-1 tim=1304105319
    FETCH #1:c=0,e=627,p=0,cr=21,cu=0,mis=0,r=15,dep=0,og=1,tim=1304105524
2->WAIT #1: nam='SQL*Net message from client' ela= 253 driver
    id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1304105818        # [non-idle]
    WAIT #1: nam='SQL*Net message to client' ela= 1 driver id=1650815232
    #bytes=1 p3=0 obj#=-1 tim=1304105867
    FETCH #1:c=0,e=63,p=0,cr=6,cu=0,mis=0,r=3,dep=0,og=1,tim=1304105900
3->WAIT #1: nam='SQL*Net message from client' ela= 1960753 driver
    id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1306066946 # [idle]
    PARSING IN CURSOR #1 len=21 dep=0 uid=0 oct=3 lid=0 tim=1306069444
    hv=2200891488 ad='89913b50'
    select user from dual
    PARSE #1:c=0,e=60,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1306069440
The first two “SQL*Net message from client” waits are in the middle of cursor processing and are considered non-idle waits.
The third “SQL*Net message from client” is between cursors and is considered an idle event, i.e. we are waiting for the next command from the client.
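A rough sketch in Python (not Oracle code) of the idle/non-idle distinction just described — the heuristic, function name and sample lines are mine, not from any Oracle tool:

```python
# Sketch: classify "SQL*Net message from client" waits in a 10046 trace as
# idle or non-idle. Heuristic from the example above: the wait is non-idle if
# the cursor is fetched again afterwards, idle if the next database call is a
# new PARSE (client "think time" between statements).
import re

def classify_client_waits(trace_lines):
    results = []
    for i, line in enumerate(trace_lines):
        if "SQL*Net message from client" not in line:
            continue
        label = "idle"
        for later in trace_lines[i + 1:]:
            if later.startswith("FETCH"):
                label = "non-idle"   # still mid-cursor
                break
            if later.startswith("PARSING"):
                break                # next statement: wait was between cursors
        ela = int(re.search(r"ela= (\d+)", line).group(1))
        results.append((ela, label))
    return results

trace = [
    "WAIT #1: nam='SQL*Net message from client' ela= 721",
    "FETCH #1:c=0,e=627",
    "WAIT #1: nam='SQL*Net message from client' ela= 1960753",
    "PARSING IN CURSOR #1 len=21",
]
print(classify_client_waits(trace))
# -> [(721, 'non-idle'), (1960753, 'idle')]
```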


SQL*Net message to client

Time it takes to pack a message to be sent to the client
Doesn’t include network timing
see Tanel Poder’s analysis of SQL*Net message to client


SQL*Net more data to client

Same as SQL*Net message to client except this is for data that spans SDU packets.

Wait represents the time it takes to pack data.
Doesn’t include network timing


SQL*Net more data from client

The only SQL*Net wait that can indicate a possible NETWORK problem
Client is sending data to shadow that spans packets (think large data inserts, possibly large code blocks, large SQL statements)
Shadow waits for next packet.
Can indicate network latency.
Can indicate a problem with the client tool
Here is an example from ASHMON where the application server died mid-stream on inserts. The shadow processes were left waiting for completion of the message. You can see the regular load on the database on the left; then, just past the middle, the load crashes and all that’s left is waits on “SQL*Net more data from client”.

Possibly set SDU=32768 as well as setting RECV_BUF_SIZE and SEND_BUF_SIZE to 65536.
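As a sketch, these settings could go in sqlnet.ora (DEFAULT_SDU_SIZE requires 10gR2 or later; on older versions set SDU per connect descriptor instead):

```
# sqlnet.ora (server and client side)
DEFAULT_SDU_SIZE=32768
RECV_BUF_SIZE=65536
SEND_BUF_SIZE=65536
```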


SQL*Net break/reset to client

Error in SQL statement

Control-C
Usually highlights an error in the application
            (CONSTRAINT T1_CHECK1 CHECK (C1 IN ('J','N')));
            '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12';
Trace File
       PARSING IN CURSOR #2 len=25 dep=0 uid=0 oct=2 lid=0 tim=5009300581224 hv=981683409 ad='8e6a7c10'
       END OF STMT
       PARSE #2:c=0,e=2770,p=0,cr=2,cu=0,mis=1,r=0,dep=0,og=1,tim=5009300581220
       BINDS #2:
       EXEC #2:c=0,e=128,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=5009300581418
       ERROR #2:err=1722 tim=512952379
       WAIT #2: nam='SQL*Net break/reset to client' ela= 31 driver id=1650815232 break?=1 p3=0 obj#=-1 tim=5009300581549
       WAIT #2: nam='SQL*Net break/reset to client' ela= 92 driver id=1650815232 break?=0 p3=0 obj#=-1 tim=5009300581662
Unfortunately Oracle doesn’t give much information for debugging unless you are tracing. If you don’t trace, the SQL won’t be captured: from Oracle’s point of view the problem statement isn’t an acceptable SQL statement, so there is no SQL ID to track down.



The dblink versions of these waits are the same:

SQL*Net message to dblink
SQL*Net more data from dblink
SQL*Net more data to dblink
SQL*Net break/reset to dblink


Analysis and Tuning

There isn’t much to do on the Oracle side for tuning. You can try optimizing the SDU, SEND_BUF_SIZE and RECV_BUF_SIZE.
For actual information on network speeds you will have to use something like
  • ping
  • tnsping
  • network sniffer



The default SDU can be set in sqlnet.ora.
If it’s not set, the default is 2048.
The max is 32768.
The default, or the value in sqlnet.ora, can be overridden in tnsnames.ora and listener.ora. The client and server negotiate the size, agreeing on the smaller of the two settings.
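As a sketch of the negotiation just described (illustrative Python, not Oracle internals; values are made up):

```python
# Toy model of SQL*Net SDU negotiation as described above: each side proposes
# its configured value (or the 2048-byte default), and the connection uses
# the smaller of the two proposals.
DEFAULT_SDU = 2048
MAX_SDU = 32768

def effective_sdu(configured=None):
    """SDU a side proposes: its setting, capped at the max, else the default."""
    if configured is None:
        return DEFAULT_SDU
    return min(configured, MAX_SDU)

def negotiated_sdu(client_sdu=None, server_sdu=None):
    """Both sides agree on the smaller of the two proposed values."""
    return min(effective_sdu(client_sdu), effective_sdu(server_sdu))

print(negotiated_sdu(32768, 32768))  # both sides raised: 32768
print(negotiated_sdu(32768, None))   # server left at default: 2048
```

Raising SDU on only one side has no effect, which is why it needs setting in both the client tnsnames.ora and the server listener.ora.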
(TDU – Transmission Data Unit – see note 44694.1 The TDU parameter has been deprecated in the Oracle Net v8.0 and beyond and is ignored. It is only mentioned here for backward compatibility.)
      V10G = (DESCRIPTION =
      (SDU = 32768)
      (ADDRESS = (PROTOCOL = TCP)(HOST = fuji)(PORT = 1522))
) )
       (SID_LIST =
       (SID_DESC =
       (SDU = 32768)
       (SID_NAME = v10g)
       (ORACLE_HOME = /export/home/oracle10)



       trace_unique_client = true


       client_3582.trc:[12-JAN-2008 11:37:39:237] nsconneg: vsn=313, gbl=0xa01, sdu=32768, tdu=32767

more from Jonathan Lewis at



The recommended size for these buffers (from Oracle’s docs) is at least
Network bandwidth * roundtrip = buffer min size
For example if the network bandwidth is 100mbs and the round trip time (from ping) is 5ms then
           100,000,000 bits   1 byte   5 seconds
           ---------------- x ------ x --------- = 62,500 bytes
            1 second          8 bits     1000
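The arithmetic above is the bandwidth-delay product; a quick sanity check in Python:

```python
# Minimum socket buffer size = bandwidth * round-trip time, as computed above.
def min_buffer_bytes(bandwidth_bits_per_sec, rtt_ms):
    """Bandwidth-delay product in bytes."""
    return bandwidth_bits_per_sec / 8 * rtt_ms / 1000

# 100 Mb/s network with a 5 ms ping round trip:
print(min_buffer_bytes(100_000_000, 5))  # -> 62500.0
```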
           V10G = (DESCRIPTION =
           (SEND_BUF_SIZE = 65536)
           (RECV_BUF_SIZE = 65536)
           (ADDRESS = (PROTOCOL = TCP)(HOST = fuji)(PORT = 1522))
           (CONNECT_DATA =
           (SERVER = DEDICATED) (SERVICE_NAME = v10g)
           ) )
           SID_LIST_LISTENER =
           (SID_LIST =
           (SID_DESC =
           (SEND_BUF_SIZE = 65536)
           (RECV_BUF_SIZE = 65536)
           (SID_NAME = v10g)
           (ORACLE_HOME = /export/home/oracle10)


performance, wait events

Oracle : buffer busy wait

March 21st, 2014


Oracle 10 and 11

Buffer busy waits on Oracle 10 and 11 usually happen because of insert contention on tables or indexes. There are a few other, rarer cases of contention: old style RBS segments, file header blocks and free lists.
Before Oracle 10 there was one other major cause: readers waiting for readers, i.e. one user does a physical IO of a block into memory and a second user wants to read that block; the second user waits until the first user’s IO is finished. Starting in 10g this wait is called “read by other session”. Before Oracle 10g this was also a “buffer busy wait”.
The easiest way to analyse the bottleneck and find a solution is to use ASH (active session History) available in Oracle 10g with the diagnostics pack license or using Simulated ASH for free or using a product like DB Optimizer.
Data block class, which can be found in ASH, is the most important piece of information in analysing buffer busy waits. If we know the block class we can determine what kind of bottleneck we have:
    1. data block
       • If OTYPE = INDEX, then the insert index leaf block is probably hot; solutions are
         • Hash partition the index
         • Use a reverse key index
       • If OTYPE = TABLE, then the insert block is hot; solutions are
         • Use free lists
         • Put the object in an ASSM tablespace
    2. segment header – If “segment header” occurs at the same time as CLASS=“data block” on the same object and the object is of OTYPE=“TABLE”, then this is just confirmation that the TABLE needs free lists or ASSM.
    3. file header block – Most likely extent allocation problems; look at the extent size on the tablespace and increase it so there are fewer extent allocations and less contention on the file header block.
    4. free lists – Add free list groups to the object.
    5. undo header – Not enough UNDO segments; if using old RBS then switch to AUM.
    6. undo block – Hot spot in UNDO; application issue.
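As an illustration only, the triage list above could be encoded as a lookup table (keys and wording are mine, mirroring the text):

```python
# The buffer busy wait triage list above, as a lookup table. The block class
# comes from ASH (p3 joined to v$waitstat); the advice strings mirror the text.
TRIAGE = {
    ("data block", "INDEX"): "hot index leaf: hash partition or reverse key",
    ("data block", "TABLE"): "hot insert block: free lists or ASSM tablespace",
    ("segment header", "TABLE"): "confirms table needs free lists or ASSM",
    ("file header block", None): "extent allocation: increase extent size",
    ("free list", None): "add free list groups",
    ("undo header", None): "not enough undo segments: switch to AUM",
    ("undo block", None): "hot spot in undo: application issue",
}

def advice(block_class, otype=None):
    # Try the (class, object type) pair first, then the class alone.
    return TRIAGE.get((block_class, otype)) or TRIAGE.get((block_class, None))

print(advice("data block", "TABLE"))
# -> hot insert block: free lists or ASSM tablespace
```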
How do we find the block class? With a quick query on the ASH data like:

select
       o.object_name obj,
       o.object_type otype,
       ash.sql_id,
       w.class
from v$active_session_history ash,
     ( select rownum class#, class from v$waitstat ) w,
     all_objects o
where event='buffer busy waits'
   and w.class#(+)=ash.p3
   and o.object_id (+)= ash.CURRENT_OBJ#
Order by ash.sample_time;
For Example


OBJ    OTYPE  SQL_ID        CLASS
------ ------ ------------- ------------------
TOTO1  TABLE  8gz51m9hg5yuf data block
TOTO1  TABLE  8gz51m9hg5yuf data block
TOTO1  TABLE  8gz51m9hg5yuf segment header
TOTO1  TABLE  8gz51m9hg5yuf data block
If we find that CLASS=datablock, then we will want more information to diagnose, such as the object type “OTYPE” , object name and what kind of tablespace the object is stored in. The following query provides that information:

set linesize 120

col block_type for a20
col objn for a25
col otype for a15
col filen for 9999
col blockn for 9999999
col obj for a20
col tbs for a10
select
       bbw.cnt cnt,
       bbw.obj obj,
       bbw.otype otype,
       bbw.sql_id sql_id,
       bbw.block_type block_type,
       nvl(tbs.name,to_char(bbw.p1)) TBS,
       tbs_defs.assm ASSM
from (
     select
       count(*) cnt,
       nvl(object_name,CURRENT_OBJ#) obj,
       o.object_type otype,
       ash.SQL_ID sql_id,
       nvl(w.class,'usn '||to_char(ceil((ash.p3-18)/2))||' '||
                   decode(mod(ash.p3,2),
                          1,'header',
                          0,'block')) block_type,
       --nvl(w.class,to_char(ash.p3)) block_type,
       ash.p1 p1
     from v$active_session_history ash,
         ( select rownum class#, class from v$waitstat ) w,
         all_objects o
     where event='buffer busy waits'
       and w.class#(+)=ash.p3
       and o.object_id (+)= ash.CURRENT_OBJ#
       and ash.session_state='WAITING'
       and ash.sample_time > sysdate - &minutes/(60*24)
       --and w.class# > 18
    group by o.object_name, ash.current_obj#, o.object_type,
          ash.sql_id, w.class, ash.p3, ash.p1
  ) bbw,
    (select file_id,
       tablespace_name name
  from dba_data_files
   ) tbs,
    (select
        tablespace_name    NAME,
        extent_management  LOCAL,
        allocation_type    EXTENTS,
        segment_space_management ASSM
     from dba_tablespaces
   ) tbs_defs
  where tbs.file_id(+) = bbw.p1
    and tbs.name = tbs_defs.name (+)
Order by bbw.cnt
and the output looks like

  CNT OBJ     OTYPE   SQL_ID        BLOCK_TYPE       TBS        ASSM

----- ------- ------- ------------- ---------------- ---------- ------
    3 TOTO1   TABLE   8gz51m9hg5yuf segment header   NO_ASSM    MANUAL
   59 TOTO1   TABLE   8gz51m9hg5yuf data block       NO_ASSM    MANUAL
Oracle 7, 8 and 9

Before Oracle 10, buffer busy waits also happened because of IO: one user reading a block in blocked another user who wanted to read the same block. On Oracle 9, the main reasons for buffer busy waits are

1)       IO read contention (only Oracle 9i and below)

2)       Insert Block Contention on Tables or Indexes
3)       Rollback Segment Contention

On  7.0  – 8.1.5 see

On versions 8 and 9, the p3 value has a different meaning. Instead of the block type (which is the best thing to know), it indicates the kind of buffer busy wait. There are only two ranges of values that matter to us:

100 range = read waits (basically just an IO wait)

Reader blocking reader, i.e. one reader is reading a block in and another session wants to read this block and waits on a buffer busy wait with p3=130.

200 range = write contention (same as in 10g)

Writers blocking other writers, for example while doing inserts, either because there are no free lists on the table or because everyone is inserting into the same index block.
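The two ranges decode the same way the queries below do; a tiny Python illustration (my own helper, not an Oracle function):

```python
# On Oracle 8 and 9 the p3 of a buffer busy wait encodes the kind of wait:
# values in the 100 range are read waits (e.g. 130 = reader blocking reader),
# values in the 200 range are write contention. First digit 1 -> read,
# 2 -> write; anything else is passed through unchanged.
def bbw_kind(p3):
    first = str(p3)[0]
    return {"1": "read", "2": "write"}.get(first, str(p3))

print(bbw_kind(130))  # -> read
print(bbw_kind(220))  # -> write
```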

If you have set up ASH style collection with S-ASH or have a product like DB Optimizer you can run a query like:


select
       count(*) cnt,
       o.object_name obj,
       o.object_type otype,
       ash.CURRENT_OBJ#,
       ash.sql_id,
       decode(substr(ash.p3,1,1),1,'read',2,'write',ash.p3) p3
from v$active_session_history ash,
      all_objects o
where event='buffer busy waits'
   and o.object_id (+)= ash.CURRENT_OBJ#
group by o.object_name, o.object_type, ash.sql_id, ash.p3, ash.CURRENT_OBJ#
order by cnt
And see what kind of buffer busy waits there are and what the objects are:


--- ------- ------- ------------ ---------- ------
  1                           -1 1375352856 read
  2                           -1  996767823 read
  2                           -1 2855119862 write
 17                           -1 1375352856 write
 89 TOTO1   TABLE         296030 1212617343 write
109                       296022 1212617343 write

Often the CURRENT_OBJ# is -1, so we can’t figure out what the object is. There is an alternative method:

col block_type for a18

col objn for a25
col otype for a15
col event for a15
col blockn for 999999
col segment_name for a20
col partition_name for a15
col owner for a15
set timing on
drop table myextents;
create table myextents as select * from dba_extents;
select
       count(*),
       ext.owner,
       ext.segment_name,
       ext.partition_name,
       ext.segment_type,
       decode(substr(ash.p3,1,1),1,'read',2,'write',ash.p3) p3
from v$active_session_history ash,
     myextents ext
where
       event = 'buffer busy waits'
   and ( current_obj# = -1 or current_obj#=0  or current_obj# is null )
   --and sample_time > sysdate - &minutes/(60*24)
   --and session_state='WAITING'
   and  ext.file_id(+)=ash.p1
   and  ash.p2 between ext.block_id and ext.block_id + ext.blocks
group by ext.owner, ext.segment_name, ext.partition_name,
         ext.segment_type, ash.p3
Order by count(*)
Because querying DBA_EXTENTS is a slow operation, I made a copy of DBA_EXTENTS, which is faster to query.


CNT OWNER  SEGMENT_NAME   PARTITION_NAME  SEGMENT_TYPE  P3
--- ------ -------------- --------------- ------------- --------
  1 SYS    _SYSSMU2$                      TYPE2 UNDO    read
  1 SYS    _SYSSMU3$                      TYPE2 UNDO    write
This second option of getting the object from P1 and P2 (file and block) should probably be done only with the user’s consent, because we have to create a copy of the DBA_EXTENTS table, which might take a long time if it’s big.
No ASH ?

If you don’t have ASH data you will have to do some guess work.
Block Class (block type)

The first step in finding out the source of buffer busy waits is looking at V$WAITSTAT.

This will tell us what kind of data blocks we have contention on.
File with contention

You can also get an idea of which file contains the object with the buffer busy waits by looking at X$KCBFWAIT, which counts buffer busy waits by file id.

Object with contention

Starting in version 9i there is the view V$SEGMENT_STATISTICS.

That will list the objects with buffer busy waits.
If you are on version 7 or 8 good luck finding the object without setting up ASH style data collection.
Why do buffer busy waits happen?

To put it most succinctly, buffer busy waits happen because two users want to change a block at the same time. Two users can change the same block, or even the same row, “at the same time”, i.e. without committing, but that’s different from the actual operation of modifying the block. The modification of the block in RAM, or computer memory, can only be done by one process at a time in order to avoid memory corruption. Different users can modify different blocks at the same time, but only one user or process can modify the same block at a time.
In order to really understand what’s going on we have to take a look at how Oracle manages memory and block access and modifications.
Here is the layout:
The diagram above shows some of the essential parts of Oracle with regard to performance tuning.
In the machine memory are
  •     Oracle’s SGA, or System Global Area, a memory that is shared between Oracle users
  •     LGWR – log writer process
  •     DBWR – database writer process
  •     User1,2,3 … – user processes, in this case “shadow processes”

On the machine file system are

  • Redo log files
  • Data files
The SGA is composed of (among other things)
  • Log Buffer
  • Library Cache
  • Buffer Cache
What’s important for understanding buffer busy waits is how the buffer cache is managed. Here is view of the buffer cache with more components:
In order to access a block, a user (shadow process) has to get a latch (cache buffers chains latch) which protects buckets, or linked lists, of buffer headers. Once the desired header is found, the latch is released. The buffer headers point to the actual data blocks in memory. Before modifying a block in memory, a user has to lock the buffer header. The buffer header is locked any time a modification is made, whether reading a block into memory or modifying a block that is already in memory. Usually the header is locked only for a brief amount of time, but when there is a lot of concurrent access the buffer header can become a bottleneck.
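As a toy illustration of the serialization just described (Python threads standing in for shadow processes; a model, not Oracle internals):

```python
# Toy illustration: the buffer header acts like a lock around the block image
# in memory. Two "shadow processes" modifying the same buffer must serialize;
# the loser's wait is the analogue of a buffer busy wait.
import threading

class Buffer:
    def __init__(self):
        self.header_lock = threading.Lock()  # stands in for the buffer header pin
        self.value = 0
        self.busy_waits = 0                  # how often someone had to wait

    def modify(self):
        if not self.header_lock.acquire(blocking=False):
            self.busy_waits += 1             # header held by someone else: "buffer busy"
            self.header_lock.acquire()       # block until it is released
        try:
            self.value += 1                  # only one modifier at a time
        finally:
            self.header_lock.release()

buf = Buffer()
threads = [threading.Thread(target=buf.modify) for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join()
print(buf.value)  # -> 10: all modifications applied, one at a time
```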
BBW when reading data – read by other session

A buffer busy wait can happen on Oracle 7, 8 and 9 when one user is reading a block into memory and a second user wants to read that block. Instead of the second user trying to read the block into memory as well, they simply wait for the first user to finish. Starting in Oracle 10, this kind of wait was renamed “read by other session”.

BBW on insert
If multiple concurrent users are inserting into a table that doesn’t have free lists and is not in an ASSM tablespace, then all users will end up inserting into the same block, the first one on the free list, and this block will become the hot block.
By adding free lists or moving the table to an ASSM tablespace we alleviate the bottleneck.
Multiple free lists:
The other option is ASSM or Automatic Segment Space Management which is set at the tablespace level.
In this case free block information is kept in Level 1 BMB (or bitmapped blocks). These Level 1 BMBs are chosen by a hash on the users process ID thus distributing the inserts across the table.
The inserts would look something like this (somewhat exaggerated drawing)
The ASSM BMB blocks take up more space in the table, about 1 extra block for every 16 data blocks, and there is overhead in first looking in the header/level 3 BMB block, then the level 2, then the level 1, and finally the data block; but all in all, ASSM is worth it for the reduced management costs versus free lists.
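A simplified model of how hashing on the process ID spreads inserts across level 1 BMBs (the modulo hash is purely illustrative, not Oracle's actual function):

```python
# Simplified model of the ASSM behaviour described above: the level 1 bitmap
# block (BMB) a session inserts into is chosen by hashing the process id, so
# concurrent inserts spread across different blocks instead of all hitting
# the first block on a free list.
def bmb_for_process(pid, n_level1_bmbs):
    return pid % n_level1_bmbs   # stand-in for Oracle's internal hash

pids = [2301, 2302, 2303, 2304]
print([bmb_for_process(p, 4) for p in pids])
# -> [1, 2, 3, 0]: four concurrent sessions land on four different BMBs
```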
Identifying and creating ASSM tablespaces
Which tablespaces are ASSM or not?


select  tablespace_name    NAME,
        extent_management  LOCAL,
        allocation_type    EXTENTS,
        segment_space_management ASSM
from dba_tablespaces


NAME            LOCAL      EXTENTS   ASSM
--------------- ---------- --------- ------
USERS           LOCAL      SYSTEM    AUTO
DATA            LOCAL      SYSTEM    MANUAL
creating an ASSM tablespace:

create tablespace data2 

datafile '/d3/kyle/data2_01.dbf' 
size 200M
segment space management auto;

BBW on index (because of insert)

If users are inserting data with a rising key value, especially a monotonically rising value, then all the new inserts have to go into the leading edge leaf block of the index, and with high concurrent inserts this can cause buffer busy waits. Solutions:
Hash partition the index
Reverse key index

BBW on old style RBS

If the block class is > 18, it’s an old style RBS segment:

Select  CURRENT_OBJ#||' '||o.object_name objn,
        o.object_type otype,
        CURRENT_FILE# filen,
        CURRENT_BLOCK# blockn,
        ash.SQL_ID,
        w.class ||' '||to_char(ash.p3) block_type
from v$active_session_history ash,
      (select rownum class#, class from v$waitstat ) w,
       all_objects o
where event='buffer busy waits'
    and w.class#(+)=ash.p3
    and o.object_id (+)= ash.CURRENT_OBJ#
Order by sample_time;


OBJN        OTYPE  FILEN  BLOCKN SQL_ID        BLOCK_TYPE
----------- ------ ------ ------ ------------- ------------
54962 TOTO1 TABLE     16   45012 8gz51m9hg5yuf data block 
54962 TOTO1 TABLE     16     161 8gz51m9hg5yuf segment header
0                     14       9 8gz51m9hg5yuf  87
0                     14       9 8gz51m9hg5yuf  87

If the block is of class > 18, there will be no object name, so we have to look it up ourselves to be sure:

select  segment_name,
        segment_type
from    dba_extents
where   &P2 between
        block_id and block_id + blocks - 1
    and file_id = &P1 ;

Plug in 14 for P1, the file #, and 9 for P2, the block number:


SEGMENT_NAME   SEGMENT_TYPE
-------------- --------------
R2             ROLLBACK
The fix is to move to the new AUM, or Automatic Undo Management:
alter system set undo_management=auto  scope=spfile;
BBW on a file header
The ASH data has two different fields that indicate the file # and block # when the wait is a buffer busy wait.
For a buffer busy wait
    File # = p1  *and* File # = current_file#
    Block # = P2  *and* Block # = current_block#
If p1 != current_file# or p2 != current_block#, then use p1 and p2; they are more reliable.
for example


----- --- --- ---- ----- -- ------ -----------------
11:44 202   2 -1          0      0 file header block
11:44 202   2 TOTO TABLE  1  60218 file header block
11:44 202   2 TOTO TABLE  1  60218 file header block
11:44 202   2 TOTO TABLE  1  60218 file header block
11:44 202   2 TOTO TABLE  1  60218 file header block
The real file # is P1 =202 and block # is P2 which is 2
In my database I only had 10 files, so what is this file# 202?!
This wait can happen when lots of extents are being allocated in the temporary tablespace.
If you are getting buffer busy waits on the file header block of a tempfile (a datafile in a temporary tablespace), then try increasing the “next extent” size of the temporary tablespace.
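As a sketch (worth verifying against your own instance): the file# reported for a tempfile is offset by the DB_FILES parameter, 200 by default, which is why tempfile 2 shows up above as file# 202:

```python
# Hedged sketch: for waits on tempfiles, the file# reported in p1 is offset
# by the DB_FILES parameter (200 by default), so tempfile file_id 2 is
# reported as 202. Check DB_FILES on your own instance before relying on it.
def tempfile_id(reported_file_no, db_files=200):
    return reported_file_no - db_files

print(tempfile_id(202))  # -> 2: the wait above was on tempfile 2
```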
What Would ADDM do?
Interestingly enough, the ADDM page doesn’t show the new load that has recently come onto the system, but the analysis is there. I clicked on the next-to-bottom line on the page, “Read and write contention on database blocks was consuming significant database time.”
Here are the outputs for the different scenarios.
Insert contention on a table
Insert contention on a table with contention on the index
RBS contention
File header contention


performance, wait events

Health Care Crises in Application Development

March 21st, 2014

if someone fraudulently uses your information for medical services or drugs, you could be held liable for the costs

The demand for healthcare application development has been exploding over the past couple of years because of

  • Obama Care – Affordable Care Act
  • Regulatory – HITECH and HIPAA Privacy Acts
  • ICD  10
  • Pro-active Health Care (versus reactive health care)
  • Mobile devices

But to develop applications for health care requires the data to be masked. Why does masking data matter, and matter especially for health care? If patient information gets out it can be quite damaging. One heuristic for the importance of health care information is that on the black market an individual’s health care information tends to sell for 100x their credit card information. Imagine that someone needs health coverage and they swipe the health care information of someone else, giving themselves free treatment. The value of the “free treatment” can well exceed the maximums on a credit card. Also imagine the havoc it can cause for the original individual if someone jumps onto their health care: important information like blood type can be logged incorrectly, or the person may have HIV logged against them when they themselves are clear. It can take years to repair the damage, or it may never be repaired if the damage is fatal.

What do Britney Spears, George Clooney, Octomom (Nadya Suleman)and the late Farah Fawcett have in common? They are all victims of medical data breaches! … How much would a bookie pay to know the results of a boxer’s medical checkup before a title bout? What would a tabloid be willing to pay to be the first to report a celebrity’s cancer diagnosis? Unfortunately it doesn’t stop there and the average citizen is equally a target.  

When data gets to untrusted parties it is called leakage. To avoid leakage, companies use masking. Masking is a form of data mediation or transformation that replaces sensitive data with equally valid fabricated data. Masking data can be more work on top of the already significant work of provisioning copies of a source database to development and QA. Development and QA can get these database copies in minutes for almost no storage overhead using Delphix (as has been explained extensively in previous blogs), but by default these copies, or virtual databases (VDBs), are not masked. Without Delphix, masking database copies in development and QA would require masking every single copy; with Delphix one can provision a single VDB, mask that VDB, and then clone, in minutes and for almost no storage, as many masked copies of that first masked VDB as needed.

 Screen Shot 2014-03-21 at 10.38.14 AM

In the above graphic, Delphix links to a source database, and keeps a compressed version along with a rolling time window of changes from the source database. With this data Delphix can spin up a clone of the source database, anywhere in that time window. The clone can be spun up in a few minutes and takes almost no storage because it initially shares all the duplicate blocks on Delphix. This first VDB can be masked and then clones of the masked VDB can be made in minutes for almost no extra storage.

With Delphix in the architecture, making masked copies is fast, easy and efficient. The first VDB that is masked will take up some extra storage for the changed data. This VDB can then become the basis for all other development and QA masked copies, so there is no need to worry about whether a development or QA database is masked. Because the source for all development and QA copies is masked, there is no way for any unmasked copies to make it into development and QA. Without the secure architecture of Delphix it becomes more complicated to verify and enforce that each copy is indeed masked. By consolidating the origins of all the downstream copies into a single set of masked shared data blocks, we can rest assured that all the downstream versions are also masked. The cloning interface in Delphix also logs all cloning activity, and chain of custody reports can be run.

How do we actually accomplish the masking? Masking can be accomplished with a number of technologies available in the industry. With Delphix these technologies can be run on a VDB in the same manner that they are currently being used with regular physical clone databases. Alternatively Delphix has hooks for the provisioning where tools can be leveraged before the VDB is fully provisioned out.

Delphix has partnered with Axis Technology to streamline and automate the masking process with virtual databases. Look for upcoming blog posts to go into more detail about Axis and Delphix.



Delphix Modernization Engine

March 18th, 2014

Delphix Modernization Engine is built on version 4 of our Agile Data Platform. Delphix built the Modernization Engine to help companies trying to migrate or consolidate data centers (including migration to private, public clouds) or retire/rationalize application portfolios. The product is designed to dramatically reduce project risk, as well as time and run-rate IT costs. Key functions include:

Virtualize entire application stacks: Delphix can now do for applications and file systems what we already did for databases. The product can virtualize app binaries, config files, other related files, as well as databases, track changes via TimeFlow, and provision space-efficient virtual copies in minutes.

Auto-transform from Unix to Linux: Many companies have hundreds or thousands of Oracle databases running on legacy Unix platforms, often at 4-8x operating and maintenance costs of x86 platforms. Converting these databases from Unix to x86 Linux requires extended, manual, and error-prone efforts. Delphix accelerates the process significantly with our new data transformation feature that will automatically convert Unix databases to Linux.

Delphix also makes it easy to move many copies of databases from one datacenter to another

Screen Shot 2014-03-17 at 4.13.58 PM

Screen Shot 2014-03-17 at 4.04.55 PM
Screen Shot 2014-03-17 at 4.05.11 PM
Screen Shot 2014-03-17 at 4.05.22 PM

Delphix 4.0 webinar


Delphix 4.0 Performance Analytics

March 17th, 2014

This is super cool, fine grain powerful performance analytics in Delphix 4.0 just released!

To understand what the analytics show, it is important to understand the role Delphix plays and where Delphix sits in the data architecture. Delphix is used to provide clones of data. The most well-known Delphix data cloning example is the use case of cloning Oracle databases. If I had three normal copies of an Oracle database, then each of those copies would be using its own separate copy of the datafiles, as shown in the image below on the left side. With Delphix, the 3 copies of the same database share one compressed copy of the datafiles. Delphix orchestrates and manages interaction with these datafiles such that any modifications are stored separately from the original files and are only visible to the clone that made the change. As far as the clone is concerned, it sees Delphix as a NAS and sees the datafiles as its own private copy. The only impact Delphix has on the Oracle virtual clones is I/O latency, thus being able to monitor I/O latency is of great interest, and here Delphix delivers in spades.

Screen Shot 2014-03-17 at 9.39.17 AM

The Delphix analytic interface shows in rich detail  the two main components of I/O latency which are NFS response from Delphix and the I/O response to Delphix. The Delphix interface shows NFS latency, IOPs and throughput from Delphix to the Oracle instances and the Delphix analytic interface shows the latency, IOPs, and throughput Delphix is getting from the storage that has been assigned to Delphix.

Screen Shot 2014-03-17 at 11.28.19 AM
If there is any concern about I/O latency on an Oracle instance using Delphix, then the analytics interface will show the latency of the NFS response from Delphix as well as the I/O response from the underlying storage. With these two data points one can determine whether the underlying storage has issues or Delphix is adding latency. If Delphix is adding latency, then the two main concerns are how much CPU Delphix is using and whether the network communications are maxing out the network pipes. For example, with the network, the most we can push over 1GbE is around 100MB/sec. Once we hit that wall we need to either use a bigger pipe like 10GbE or use multiple bonded pipes. Similarly, if we max out the CPU on Delphix, then we need to add CPU.

Delphix performance analytics gives all of this. The layout looks like

Screen Shot 2014-03-17 at 1.05.30 PM


An actual screen capture example looks like

The above example shows the top graph for CPU usage on Delphix.

The second section is network throughput. There are two lines because this system is using two separate NICs.

The third section is the I/O throughput and latency from the underlying storage Delphix is using.

A 4th and/or 5th section shows NFS statistics for UNIX clients and iSCSI statistics for Windows clients. These sections can be displayed or hidden by clicking the check boxes in the top left of the screen.

The third section, the I/O section gives a heat map of I/O latencies. Using only an average value for I/O latency hides important information, so Delphix shows rich detailed data. The I/O can also be separated out into just reads or just writes or the aggregate of both reads and writes which is the default. To show only reads or writes, click on the legend items in the top right of that section.
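To see why an average hides information, compare two synthetic latency profiles with the same mean (the numbers are made up for illustration):

```python
# Why a heat map beats an average: two latency profiles with the same mean
# can behave very differently. Synthetic millisecond values for illustration.
flat    = [5.0] * 100                # uniform 5 ms I/Os
bimodal = [1.0] * 90 + [41.0] * 10   # mostly fast, with 10% slow outliers

mean = lambda xs: sum(xs) / len(xs)
print(mean(flat), mean(bimodal))     # same average: 5.0 and 5.0
print(max(flat), max(bimodal))       # very different worst case: 5.0 vs 41.0
```

A latency heat map exposes the second, bimodal shape that the single average value would hide.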

The graphs have powerful customization options, like the time window selectors below each graph. For the heat maps, Delphix lets the user zoom in on particular latency ranges of interest.



Delphix 4.0 webinar

See minute 32:20 for a discussion of performance analytics




Enterprise Performance Tuning in the Age of Agile Data

March 14th, 2014

The new Oracle Scene magazine is out. Check out pages 48-50 for an article by me.

Here are the awesome contents of Oracle Scene magazine Spring 2014:

  • Reading Execution Plans by Jonathan Lewis, Ace Director 18
  • Angels in the Architecture by Chris Muir & Frank Nimphius 28
  • The Ultimate Modernisation Challenge by Mia Urman, Ace 35
  • How Social Can Sales Teams Get With Oracle Sales Cloud by Prity Tewary 24
  • Speeding Financial Close: Cross-Instance E-Business Suite Transaction Drilldown by Jay Langhurst 14
  • The ERP OBA Implementation Secret by Dale Kaplan & Sagar Malvankar 39
  • Win BIG with LinkedIn Groups by Dave Sumner Smith 46
  • Focus on Ireland: 10th Anniversary Celebration 08
  • UKOUG A Year Ahead 10
  • UKOUG Events Timeline 2014 22
  • Women in IT by Debra Lilley, Ace Director 05
  • Public Speaking by Tim Hall, Ace Director 17
  • Meet a Volunteer: Simon Haslam, Ace Director 27
  • News and Reviews 06
  • Enterprise Performance Tuning in the Age of Agile Data by Kyle Hailey, Ace 48
  • Large Pages in Linux by Martin Bach, Ace Director 51
  • Using SQL Plan Management for Performance Testing by Maris Elsins 56
  • Slow Flashback Query? by Chris Lawson, Ace 61
  • Moving Forward with B2B Modern Marketing by Ruby Cameron 63


Enterprise Performance Tuning in the Age of Agile Data


Oracle Cross Platform Provisioning: Magic from the Mess

March 12th, 2014

Linux is taking over the workloads from the shrinking market share of

  • IBM AIX pSeries
  • Solaris Sparc Servers
  • HP-UX Itanium
because Linux provides a reliable and less expensive platform. According to IDC, the cost per user of Linux can be on the order of 1/5 that of UNIX:
Legacy RISC Systems Cost $1,425/User versus Linux at $256/User
– IDC 2012
Linux systems will represent the majority of the market by 2017:
By 2017, 65% of applications running on Unix in 2012 will have migrated to x86 Linux
ERP scale-out architectures on x86 platforms will rise from 10% in 2005 to 80% in 2017
 – Gartner
How will current UNIX systems migrate to Linux?

For Oracle customers, how can data be moved from Oracle on UNIX to Oracle on Linux? There is a magic technology from Delphix that provides this conversion.

For an explanation of Delphix’s method, as well as how to manually convert a database without Delphix, read the continuation at

By the way, cross platform conversion can take a long time. The time depends on the size of the database, the number of objects in the database, and the hardware used during the conversion.


At Delphix the largest database we have converted is 100TB. We don’t currently recommend tackling a database of more than 10TB because of conversion time. Basically we recommend:
  • Database < 10 TB
  • Database < 10K objects
  • Non-EBS, non-PeopleSoft (due to large object count)
These issues will also slow down any conversion using RMAN convert. So if you have a large database, say 150TB with more than 10K objects, how will you convert it? It will easily take a week, and it’s going to be even harder without Delphix. If your database doesn’t pass one of the above criteria, the conversion will probably take some consulting time. See Oracle support document 1389592.1 for Oracle’s recommendations on cross platform conversion.
If you want to use Delphix, you have to provide us with excellent:
  • Network latency (sub-millisecond)
  • Bandwidth (minimum 10 GbE)
  • Class A storage (typically SSDs)
  • Beefed-up compute and memory for Delphix (512 GB dedicated, 24 vCPUs)

That said, for a small database you can do the conversion on your own, even for free with Delphix Express (the free version of Delphix).

Here is a little flow diagram on whether Delphix recommends using Delphix for the conversion:

Screen Shot 2016-03-08 at 12.10.23 PM
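The decision flow in the diagram can also be sketched as a simple function. This is a hypothetical illustration that mirrors this post's rules of thumb, not a Delphix API:

```python
# Hypothetical sketch of the conversion decision flow described above.
# Thresholds come from this post's rules of thumb, not from Delphix docs.

def recommend_delphix_conversion(size_tb, object_count, app_type=None):
    """Return a rough recommendation for cross-platform conversion."""
    if size_tb >= 10:
        return "not recommended: database >= 10 TB, conversion time too long"
    if object_count >= 10_000:
        return "not recommended: >= 10K objects"
    if app_type in ("EBS", "PeopleSoft"):
        return "not recommended: packaged apps carry large object counts"
    return "good candidate for Delphix cross-platform conversion"

print(recommend_delphix_conversion(2, 3_000))       # small custom database
print(recommend_delphix_conversion(150, 15_000))    # the 150TB case above
```

A database failing any one of the checks falls into "consulting time" territory, per Oracle support document 1389592.1.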

PS: here is a discussion with someone trying to convert 500TB, though the discussion, so far, hasn’t shown anything great.

And here is a more in-depth discussion on Oracle-L.



Delphix 4.0 released ! The Modernization Engine

March 11th, 2014

Delphix is amazing technology. I’ve been at Delphix for almost 4 years; my first year there was the first year Delphix was released on the market. Even from that first release, Delphix revolutionized how we manage and think about data. Data goes from being an onerous, massive burden to a fast, easy, powerful resource. As one customer put it, data goes from being your enemy to being your friend (and a powerful one at that).

The first production version of Delphix, released in 2010, was version 2. Version 2 virtualized Oracle databases. What does virtualizing a database mean? It means that Delphix can provision copies of Oracle databases in minutes, for almost no storage overhead, to anywhere on your network, by using NFS or iSCSI and by sharing the duplicate blocks across all the copies. More than that, Delphix collects and tracks all the changes from a source database so that clone copies can be provisioned from different points in time. The time window in which clones can be made is 2 weeks by default but can be configured to be 2 months or 2 years; data older than this window is purged automatically. All the different clones can co-exist concurrently and independently, meaning they can be created and dropped, modify data, drop tables, etc. without affecting each other.
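The retention window behaves conceptually like this sketch, which keeps change records inside a configurable window and drops the rest. This is illustrative only, not how Delphix stores changes internally:

```python
from datetime import datetime, timedelta

# Toy model of a retention window: change data is kept for a
# configurable window and anything older is purged.
# (Illustrative only -- not Delphix's internal storage format.)

def purge_old_changes(changes, window=timedelta(weeks=2), now=None):
    """Keep only change records whose timestamp falls inside the window."""
    now = now or datetime.now()
    cutoff = now - window
    return [c for c in changes if c["time"] >= cutoff]

now = datetime(2014, 3, 11)
changes = [
    {"time": now - timedelta(days=1),  "scn": 1003},   # provisionable
    {"time": now - timedelta(days=10), "scn": 1002},   # provisionable
    {"time": now - timedelta(days=30), "scn": 1001},   # outside the window
]
kept = purge_old_changes(changes, now=now)
print([c["scn"] for c in kept])   # [1003, 1002]
```

Any point in time still inside the window can be used as the provisioning point for a clone; widening the window (to 2 months or 2 years) simply trades storage for a longer reachable history.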

Version 3 was released in 2012 and introduced database virtualization not only for Oracle but also for MS SQL Server.

Now version 4, released in 2014, takes data virtualization beyond databases by adding virtualization of the application stack. The application stack became the provisioning bottleneck for our customers once we had eliminated the bottleneck of provisioning database clone copies. Delphix has gone from a database virtualization company to a data virtualization company. The Delphix appliance has now become known as an Agile Data Appliance (ADP).

The core new features of Delphix version 4 are

  • Application stack virtualization
  • Postgres database virtualization
  • Cross platform provisioning for Oracle (Unix to Linux conversion)
  • Powerful performance analytics monitor and dashboard
  • Customized optimized network performance layer including compression options

Application Stack Virtualization

Application stack virtualization pales in complexity compared to virtualizing the data in databases. At its simplest, application stack virtualization simply presents a set of application files; for example, it could be an ORACLE_HOME directory with all its binaries. On the other hand, for application stacks like financials, provisioning copies can require significant modifications for each new copy. As part of the Delphix 4.0 release, we have provided an Oracle EBS application stack virtualization module that manages and drives the custom modifications required for provisioning copies of an Oracle EBS application stack. Look forward to more custom modules from Delphix for other application stacks like PeopleSoft and SAP, and to future integration with tools such as Puppet and Chef.

Postgres virtualization

Along with support for virtualizing Oracle and MS SQL Server, PostgreSQL has been added to the stack, including support for EnterpriseDB Postgres Plus Advanced Server. Future versions of Delphix will continue to expand the reach of virtualization; Sybase support is currently in limited beta release.

Cross platform provisioning for Oracle

One of the most exciting, super-wow-geek-factor features of Delphix 4.0 is cross platform provisioning. Cross platform provisioning allows a database whose source is on a version of UNIX (such as HP-UX, AIX or Solaris Sparc) to be provisioned onto a Linux platform. This is the kind of feature that gives an inkling of the strength and knowledge of the engineers at Delphix: they have figured out how to make the endianness changes such that the Linux version of the database takes only approximately 1% additional storage on top of the UNIX version. Look forward to the next blog post for more information about cross platform provisioning.
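To illustrate what an endianness change involves at the byte level, here is a minimal sketch of the same 32-bit value laid out big-endian (Sparc, AIX, HP-UX) versus little-endian (x86 Linux). The actual datafile conversion is of course far more involved than this:

```python
import struct

# The same 32-bit value as stored on a big-endian platform
# (Sparc, AIX, HP-UX) versus a little-endian x86 Linux platform.
value = 0x01020304

big    = struct.pack(">I", value)   # b'\x01\x02\x03\x04'
little = struct.pack("<I", value)   # b'\x04\x03\x02\x01'

# Converting a big-endian image to little-endian means rewriting
# multi-byte fields like this one throughout the datafiles:
converted = struct.unpack(">I", big)[0]
assert struct.pack("<I", converted) == little
```

Every multi-byte field in every block header has to be rewritten this way, which is why cross platform conversion time scales with database size and object count.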

Analytics monitor

I’m a performance data and graphical interface junkie, so I’m super excited about the powerful new performance monitoring and analytics UI released with Delphix 4.0. Look forward to a blog post on this feature in the next few days.

Customized Network Communication Layer

Delphix 4.0 introduces a customized network communication layer for certain operations, called the Delphix Session Protocol (DSP). Delphix re-architected SnapSync and Oracle V2P (virtual to physical provisioning) to function over DSP which, among other things, has built-in compression, security and resiliency to temporary network outages, while improving the performance of each of these services.
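DSP's internals aren't public, but the benefit of built-in compression is easy to illustrate: database change streams tend to be highly compressible. A small sketch using zlib (a stand-in here; DSP's actual compression scheme is not documented in this post):

```python
import zlib

# Database change data is often highly repetitive, so compressing it
# before it crosses the wire can cut transfer volume dramatically.
payload = b"INSERT INTO emp VALUES (7839, 'KING', 'PRESIDENT');\n" * 1000

compressed = zlib.compress(payload, level=6)
ratio = len(payload) / len(compressed)
print(f"{len(payload)} bytes -> {len(compressed)} bytes ({ratio:.0f}x smaller)")
```

When the network pipe is the bottleneck, a few CPU cycles spent compressing on one side and decompressing on the other buys a large effective bandwidth increase.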

For more info see



Gartner & Delphix : DELIVER APPLICATIONS with better, faster data

March 10th, 2014

Screen Shot 2014-03-10 at 12.21.07 PM

Original PDF for this content available here.

A new video webcast features Gartner research VP Jim Duggan and Delphix CEO Jed Yueh discussing how IT management can best add business value when data volumes are growing exponentially. Their conversation is an enlightening exploration of a very topical challenge: delivering application projects on time, on budget, and on quality. This at-a-glance guide summarizes the 7 essential questions Jim and Jed discuss. For a deeper dive into these issues, you can watch the full version here.

1. How should CIOs think about technology for managing data?

CIOs are focused on managing schedules and costs – and, of course, on mitigating their organizations’ risk exposure. But the business is demanding technology support and business capabilities. The CIO needs to think “how can I best deliver that?”

2. What type of projects benefit from agile data?

Agile data helps you get the right data into the hands of the right team at the right time. So it impacts the entire range of application projects.

3. Why is data management so difficult now?

Development and data processes have both changed significantly. Data is more volatile, there’s more of it, and it comes from more sources. Test centers need to be consolidated, which means data provisioning times can become a serious obstacle to progress. Put simply, these changes are rendering traditional approaches to data management increasingly unfit for purpose.

4. How does Delphix accelerate application projects? 

To accelerate complex application projects, a number of technologies need to work in unison. Delphix synchronizes data from existing systems and versions any changes. It then enables virtualization of that data so you can quickly deliver it to the right people – like your development or QA teams. With Delphix, you can do this in a matter of minutes or seconds, as opposed to the months it can take if you wait for IT operations.

5. What are the business effects of failed projects?

Application projects exist to deliver value to the business within an agreed schedule. Falling behind this schedule will exert a negative business impact. Frequent consequences of late application delivery include getting late to market, excess costs, and a knock-on effect on parts of the business that have to wait until the application is finished.

6. How have Delphix customers benefitted from agile data?

Delphix has helped companies like eBay eliminate as much as 20 percent of their release times for major platform refreshes. And at Procter & Gamble, Delphix helped reduce a five-week data refresh period to just two days. In fact, many of the world’s number one companies have adopted the Delphix platform. Why? Because these companies habitually move fastest and outpace their competitors. Companies like Fortune #1 Walmart, Pfizer and Cisco have all moved to agile data – because they’re the world leaders, and they plan to stay that way.

7. Who is affected by data management challenges in most firms?

Of course, day-to-day data challenges are typically managed by IT teams. But data really affects the entire organization. Ultimately, CEOs, shareholders, end users and customers are all affected by applications – and by how successfully the organization can get data into the hands of the right teams at the right times.


Rock Star DBA in the age of Agile Data

March 6th, 2014


photo by Jesus Solana

DBAs were cutting edge once upon a time. DBAs protected the company jewels and made sure the company’s data was secure, fast and dependable. DBAs discovered new ways to tune databases, exposed how internals worked, and architected and proved optimal database designs. Now that so much of that information and knowledge has been created, documented, and automated, DBA work is becoming more and more commoditized. The DBA used to be a rock star job and now it’s just a run-of-the-mill job. How can a rock star DBA rise above the masses of commodity DBAs? That question can be answered by finding the answer to this one:

How can a DBA have the greatest impact on increasing the revenue for their company ?

When asked how they can improve the business, most DBAs think of tuning the database. By tuning the database they generally mean changing some database startup parameters or tuning some slow queries. But tuning startup parameters and slow queries pales in comparison to the impact of improving the architecture that accesses the database. Occasionally a DBA will get involved in application architecture, and instead of the 10% improvements in performance from tuning queries or startup parameters we see 1000% improvements, i.e. 10x. But even those kinds of architectural improvements pale in comparison to what a DBA can actually do to improve the bottom line of their company.

Improving the performance of a database makes little to no difference if the applications that use that database are failing to give the business the information and market advantages it needs to succeed in the marketplace. In order for a business to succeed it needs to obtain business intelligence fast and produce the right applications quickly.

To produce applications quickly and supply business intelligence fast requires agile data. What is agile data? Agile data is getting the right data to the right place fast. Agile data means replacing costly redundant hardware and enormous, slow bureaucracy with flexible, intelligent software that virtualizes the data. By virtualizing data we mean that when copies are made, the copies initially share the exact same storage. Thus each new copy takes up zero storage and can be instantly provisioned. As the copies are modified they take up more storage in order to separately store the new modified data. Copies can be supplied via NFS to any machine on the network. The result: data can be supplied immediately, anywhere, for little to no storage.

How do you achieve agile data? Agile data consists of 3 parts:

  1. Data source syncing. A data source could be a database or an application stack. Syncing means continuously collecting changes, keeping those changes for a time window, and purging any data that falls out of the window. The changes are kept on a specialized storage system (2).
  2. Specialized storage system. Manages file system block snapshotting, thin cloning, compression, block change time tracking, and purging of replaced blocks that fall outside the time window.
  3. Data source provisioning. Automated provisioning of the source to a new location, which means not copying data but instead setting up an NFS mount that exposes thin cloned files to the target. Thin cloned files initially share all the same data blocks and diverge as the target machines modify the files. There is no moving or copying of data; data is immediately available. Not only is it available, it’s a fully autonomous clone of the data source. The clone is autonomous because, although it initially shares all its data file blocks, any changes it makes are stored separately from the initial shared blocks. The data can be refreshed, rolled back, modified and/or branched independently of other clones. In the case of a database it also takes care of parameterizing the database, changing names and file paths, and recovering and opening the database.
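The copy-on-write behavior behind thin cloning can be modeled in a few lines. This is a toy sketch of the concept, not Delphix's implementation:

```python
# Toy model of thin cloning: a clone initially shares every block with
# its source and stores only the blocks it overwrites (copy-on-write).

class ThinClone:
    def __init__(self, source_blocks):
        self.source = source_blocks      # shared, read-only
        self.private = {}                # blocks this clone has modified

    def read(self, blkno):
        # Prefer the clone's own copy; fall back to the shared block.
        return self.private.get(blkno, self.source[blkno])

    def write(self, blkno, data):
        self.private[blkno] = data       # diverge only for changed blocks

source = {n: f"block-{n}" for n in range(1000)}
clone = ThinClone(source)
clone.write(7, "patched")

print(clone.read(7))          # the clone's own copy
print(clone.read(8))          # still shared with the source
print(len(clone.private))     # storage cost so far: 1 block, not 1000
```

This is why a freshly provisioned clone takes essentially zero storage and why many clones can coexist: each one pays only for the blocks it has changed.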

All of this functionality is called a Data Virtualization Appliance (DVA). How do you get one? You can try to build it yourself on specialized hardware that has file system snapshots, or you can use Delphix. Delphix is software that runs on any commodity Intel hardware and uses any storage given to it, mapping its specialized cloning filesystem onto that storage. Delphix automatically handles all the syncing and collection of changes from a source, as well as the automated provisioning of data out to target hosts. Delphix is so versatile that I have it running on my MacBook Pro in a VM under VirtualBox, provisioning clones of a database to a Linux VM also running in VirtualBox on my laptop.

What kind of impact does agile data have on companies?

The average project acceleration is 2x!

It’s like taking a team of 100 developers and QA personnel and getting the value of a team of 200!

“Delphix allowed us to shrink our project schedule from 12 months to 6 months.”
– BA Scott, NYL VP App Dev


“It used to take 50-some-odd days to develop an insurance product, … Now we can get a product to the customer in about 23 days.”
– Presbyterian Health


“Can’t imagine working without Delphix”
– Ramesh Shrinivasan CA Department of General Services

At a high level a DVA such as Delphix impacts companies in the following ways:

  • Accelerate custom application projects by getting databases provisioned in minutes, refreshed and synchronized on demand, and by enabling developers to each have a full database where they had to share one before. Delphix has many SAP and Oracle EBS customer successes.
  • Triage outages in minutes by using database rollback (like a DVR for databases) to any second in time, refreshing those databases iteratively until the root cause is identified and fixed.
  • Eliminate restrictions placed on Business Intelligence users, such as ETL load windows, peak hour lockout, and limits on the number of users accessing a data mart.
  • Dramatically improve SLAs (minutes for RTO, seconds for RPO).
  • Reduce the cost of infrastructure – between 50%-80% of non-production storage, hardware, and admin.
  • Enable QA to test with full masked or unmasked data sets throughout the cycle, discovering bugs sooner, producing bug-free releases, and eliminating QA setup time between tests.
  • Provide one easy-to-use interface to provision, refresh, synchronize, roll back, branch, and bookmark many versions of SQL Server and Oracle.
  • Provide a “poor man’s” DR solution, especially for non-production environments, with the ability to create virtual from physical, and physical from virtual, if needed because of data corruption or an outage.

The enormous power of project acceleration and faster business intelligence data is why leading companies across different business lines have adopted Delphix:

  • Fortune #1 Walmart
  • #1 pharmaceutical Pfizer
  • #1 social Facebook
  • #1 US bank  Wells Fargo
  • #1 networking  Cisco
  • #1 cable provider Comcast
  • #1 auction site Ebay
  • #1 insurance New York Life
  • #1 chip manufacturer Intel