Oracle CPU Time

August 27th, 2013


 

 

oracle_cpu_bertrand_drouvot.png

Image from 

UPDATE: thanks to a discussion in the comments it’s come to my attention that there should be some more clarification on values used and what the mean at the beginning of this post.

Ever wonder where CPU wait comes from in EM performance screens as seen above? well the following discussion will give you a SQL query to calculate CPU wait. In the above image we see both CPU and CPU Wait as two colors (light green and dark green) in EM as well as two values from the output of the SQL script (CPU_ORA and CPU_ORA_WAIT).

Oracle CPU statistics are measured from calls to the OS to see how much CPU is burned over a given elapsed time (i.e. certain quantity of code). The CPU values are cycles used and not time, thus it does not include time on the run queue waiting to get the CPU.

Oracle ASH, on the other hand, lists all Oracle sessions that want to run from Oracle’s perspective, i.e. they aren’t idle and they aren’t waiting for a non-idle wait event like I/O. Thus ASH includes time spent both running on CPU burning cycles and time spent waiting to get on the CPU.

Thus we can take the the amount of time “On CPU” from ASH and subtract the amount of CPU in Oracle statistics for CPU usage then the remainder is roughly time spent by Oracle sessions waiting to get onto the CPU.

The two challenges to getting the value of “Wait for CPU” are getting  CPU cycles burned and ASH time “ON CPU” into the same units and making sure that we are measuring both over the same interval.

Oracle already reports CPU in % used and in centi-seconds used.  I like to transform this into Average Active Sessions on CPU so I can compare it to my main way of showing ASH data which is Average Active Sessions on “ON CPU” which includes wait for CPU.

There are 3 kinds of CPU in the Oracle stats.

  1. Oracle CPU used
  2. System CPU used
  3. Oracle demand for CPU

Starting in 10g Oracle records both the CPU used by the instance as well as the load on the system in v$sysmetric. This is awesome as we can see how busy the system is and how much of the CPU Oracle is responsible for:

col metric_name for a25
col metric_unit for a25
select metric_name, value, metric_unit from v$sysmetric where metric_name like'%CPU%' and group_id=2;
METRIC_NAME                         VALUE METRIC_UNIT
------------------------------ ---------- ------------------------------
CPU Usage Per Sec              251.067016 CentiSeconds Per Second
CPU Usage Per Txn              5025.52477 CentiSeconds Per Txn
Host CPU Utilization (%)       11.6985845 % Busy/(Idle+Busy)
Database CPU Time Ratio        76.3291033 % Cpu/DB_Time

Now the question is how do we convert these to something useful? For me I put it into the equivalent of AAS and compare it to the core count:

   select 'CPU_ORA_CONSUMED'                                     CLASS,
                    round(value/100,3)                             AAS
             from v$sysmetric
             where metric_name='CPU Usage Per Sec'
               and group_id=2
          union
            select 'CPU_OS'                                        CLASS ,
                    round((prcnt.busy*parameter.cpu_count)/100,3)    sAAS
            from
              ( select value busy from v$sysmetric
                where metric_name='Host CPU Utilization (%)'
                 and group_id=2 ) prcnt,
             ( select value cpu_count
                 from v$parameter
                where name='cpu_count' )  parameter;

CLASS                  AAS
---------------- ----------
CPU_ORA_CONSUMED       .002
CPU_OS                 .022

An AAS of 1 is equivalent to 100% of a core, so, OS CPU is about 2% of a core and of that Oracle used 0.2% of a core.
Not a very active system, and we can look at an active system later, but what I wanted to point out is that this query is missing an important statistic: the demand for CPU by Oracle. We can only add that, AFAIK, by joining in ASH:

   select 'CPU_ORA_CONSUMED'                                     CLASS,
                    round(value/100,3)                             AAS
             from v$sysmetric
             where metric_name='CPU Usage Per Sec'
               and group_id=2
          union
            select 'CPU_OS'                                         CLASS ,
                    round((prcnt.busy*parameter.cpu_count)/100,3)     AAS
            from
              ( select value busy from v$sysmetric
                 where metric_name='Host CPU Utilization (%)'
                   and group_id=2 ) prcnt,
              ( select value cpu_count from v$parameter
                 where name='cpu_count' )  parameter
          union
             select
               'CPU_ORA_DEMAND'                                            CLASS,
               nvl(round( sum(decode(session_state,'ON CPU',1,0))/60,2),0) AAS
             from v$active_session_history ash
             where SAMPLE_TIME > sysdate - (60/(24*60*60));

CLASS                   AAS
---------------- ----------
CPU_ORA_CONSUMED       .001
CPU_ORA_DEMAND          .02
CPU_OS                 .019

So the demand for CPU was higher than the amount consumed. Now the demand for CPU is coming from ASH which is sampled so the accuracy is weak, but in larger sample sets or busier systems it’s pretty darn good. The demand alerts us to CPU starvation on a busy  system.

I like to wrap all this up into a query with all the wait classes to see the overall load on Oracle including CPU consumed by Oracle, CPU demanded by Oracle and CPU used at the OS level:

select
                 decode(n.wait_class,'User I/O','User I/O',
                                     'Commit','Commit',
                                     'Wait')                               CLASS,
                 sum(round(m.time_waited/m.INTSIZE_CSEC,3))                AAS
           from  v$waitclassmetric  m,
                 v$system_wait_class n
           where m.wait_class_id=n.wait_class_id
             and n.wait_class != 'Idle'
           group by  decode(n.wait_class,'User I/O','User I/O', 'Commit','Commit', 'Wait')
          union
             select 'CPU_ORA_CONSUMED'                                     CLASS,
                    round(value/100,3)                                     AAS
             from v$sysmetric
             where metric_name='CPU Usage Per Sec'
               and group_id=2
          union
            select 'CPU_OS'                                                CLASS ,
                    round((prcnt.busy*parameter.cpu_count)/100,3)          AAS
            from
              ( select value busy from v$sysmetric where metric_name='Host CPU Utilization (%)' and group_id=2 ) prcnt,
              ( select value cpu_count from v$parameter where name='cpu_count' )  parameter
          union
             select
               'CPU_ORA_DEMAND'                                            CLASS,
               nvl(round( sum(decode(session_state,'ON CPU',1,0))/60,2),0) AAS
             from v$active_session_history ash
             where SAMPLE_TIME > sysdate - (60/(24*60*60));

CLASS                   AAS
---------------- ----------
CPU_ORA_CONSUMED       .002
CPU_ORA_DEMAND          .03
CPU_OS                 .023
Commit                    0
User I/O                  0
Wait                      0

Ideally I’d want the CPU stats to be subsets of each other so that I could have a graphically stack-able set of statistics

now rolling it all together
with AASSTAT as (
           select
                 decode(n.wait_class,'User I/O','User I/O',
                                     'Commit','Commit',
                                     'Wait')                               CLASS,
                 sum(round(m.time_waited/m.INTSIZE_CSEC,3))                AAS
           from  v$waitclassmetric  m,
                 v$system_wait_class n
           where m.wait_class_id=n.wait_class_id
             and n.wait_class != 'Idle'
           group by  decode(n.wait_class,'User I/O','User I/O', 'Commit','Commit', 'Wait')
          union
             select 'CPU_ORA_CONSUMED'                                     CLASS,
                    round(value/100,3)                                     AAS
             from v$sysmetric
             where metric_name='CPU Usage Per Sec'
               and group_id=2
          union
            select 'CPU_OS'                                                CLASS ,
                    round((prcnt.busy*parameter.cpu_count)/100,3)          AAS
            from
              ( select value busy from v$sysmetric where metric_name='Host CPU Utilization (%)' and group_id=2 ) prcnt,
              ( select value cpu_count from v$parameter where name='cpu_count' )  parameter
          union
             select
               'CPU_ORA_DEMAND'                                            CLASS,
               nvl(round( sum(decode(session_state,'ON CPU',1,0))/60,2),0) AAS
             from v$active_session_history ash
             where SAMPLE_TIME > sysdate - (60/(24*60*60))
)
select
       ( decode(sign(CPU_OS-CPU_ORA_CONSUMED), -1, 0, (CPU_OS - CPU_ORA_CONSUMED )) +
       CPU_ORA_CONSUMED +
        decode(sign(CPU_ORA_DEMAND-CPU_ORA_CONSUMED), -1, 0, (CPU_ORA_DEMAND - CPU_ORA_CONSUMED ))) CPU_TOTAL,
       decode(sign(CPU_OS-CPU_ORA_CONSUMED), -1, 0, (CPU_OS - CPU_ORA_CONSUMED )) CPU_OS,
       CPU_ORA_CONSUMED CPU_ORA,
       decode(sign(CPU_ORA_DEMAND-CPU_ORA_CONSUMED), -1, 0, (CPU_ORA_DEMAND - CPU_ORA_CONSUMED )) CPU_ORA_WAIT,
       COMMIT,
       READIO,
       WAIT
from (
select
       sum(decode(CLASS,'CPU_ORA_CONSUMED',AAS,0)) CPU_ORA_CONSUMED,
       sum(decode(CLASS,'CPU_ORA_DEMAND'  ,AAS,0)) CPU_ORA_DEMAND,
       sum(decode(CLASS,'CPU_OS'          ,AAS,0)) CPU_OS,
       sum(decode(CLASS,'Commit'          ,AAS,0)) COMMIT,
       sum(decode(CLASS,'User I/O'        ,AAS,0)) READIO,
       sum(decode(CLASS,'Wait'            ,AAS,0)) WAIT
from AASSTAT)
/

    CPU_OS    CPU_ORA CPU_ORA_WAIT     COMMIT     READIO       WAIT
---------- ---------- ------------ ---------- ---------- ----------
       .02       .002            0          0          0          0

Now let’s run up some load on a machine and database.
Take two databases, run up the CPU demand on both and add some wait contention. The machine has 24 cores so there is a definitely a problem when the CPU_TOTAL goes over 24. I’m running 14 sessions each trying to burn a core on two different databases. The first few lines the test is ramping up

SQL> /

 CPU_TOTAL     CPU_OS    CPU_ORA CPU_ORA_WAIT     COMMIT     READIO       WAIT
---------- ---------- ---------- ------------ ---------- ---------- ----------
    14.887       .387     13.753         .747          0          0       .023

SQL> /

 CPU_TOTAL     CPU_OS    CPU_ORA CPU_ORA_WAIT     COMMIT     READIO       WAIT
---------- ---------- ---------- ------------ ---------- ---------- ----------
    21.989      7.469     12.909        1.611          0          0       .044

SQL> /

 CPU_TOTAL     CPU_OS    CPU_ORA CPU_ORA_WAIT     COMMIT     READIO       WAIT
---------- ---------- ---------- ------------ ---------- ---------- ----------
    26.595     12.125     11.841        2.629          0          0       .025

SQL> /

 CPU_TOTAL     CPU_OS    CPU_ORA CPU_ORA_WAIT     COMMIT     READIO       WAIT
---------- ---------- ---------- ------------ ---------- ---------- ----------
    27.045     12.125     11.841        3.079          0          0       .025

Historically CPU used by Oracle was derived from

v$sysstat.name=’CPU used by this session’

but this statistic had problems as the value was only updated every time a call ended. A call could be a 1 hour PL/SQL procedure which would thus report zero cpu usage in the stats until it finished and the CPU would spike off the scale.

ASH had always been the most stable way to gather CPU demand, though Oracle has made improvements in gathering CPU statistics. I believe that the time model gathers CPU every 5 seconds in 10g, and in 11g it’s possible that CPU stats are gathered every second

Here is a visual example of a machine that has server memory contention, massive amounts of paging. There is OS CPU being used, but hardly any CPU being used by Oracle which makes sense as it’s an idle database, but what is revealing is the massive amount of CPU wait by Oracle. Oracle only has a little bit of work to do to take care of an idle database but we can see that most of Oracle’s CPU time is wait for CPU time as when it wants to work, pages have to be read back in,

I have my doubts as to the clarity of the layout of the above graph. A possibly clearer graph would be simply adding a line representing available CPU and take out the OSCPU bars. In the above graph I’ve charted OSCPU usage as AAS, ie average active sessions, mixing AAS of the database with AAS at the OS level. I think a  possible clear representation would be to show the Core count line, and draw the OSCPU usage shown upside down from the # of core lines, thus the space from the bottom axis to where the OSCPU reaches down would be available CPU.

 UPDATE

Thanks to the eagle eyes of John Beresniewicz a small error was identified in the above script. The last script didn’t correlate the time windows of v$sysmetric with v$active_session history. They both reported the last minute of statistics but the last minute reported in v$sysmetric could be up to a minute behind those in v$active_session_history, so here is a version that tries to correlate to two time windows so they are in sync

with AASSTAT as (
           select
                 decode(n.wait_class,'User I/O','User I/O',
                                     'Commit','Commit',
                                     'Wait')                               CLASS,
                 sum(round(m.time_waited/m.INTSIZE_CSEC,3))                AAS,
                 BEGIN_TIME ,
                 END_TIME
           from  v$waitclassmetric  m,
                 v$system_wait_class n
           where m.wait_class_id=n.wait_class_id
             and n.wait_class != 'Idle'
           group by  decode(n.wait_class,'User I/O','User I/O', 'Commit','Commit', 'Wait'), BEGIN_TIME, END_TIME
          union
             select 'CPU_ORA_CONSUMED'                                     CLASS,
                    round(value/100,3)                                     AAS,
                 BEGIN_TIME ,
                 END_TIME
             from v$sysmetric
             where metric_name='CPU Usage Per Sec'
               and group_id=2
          union
            select 'CPU_OS'                                                CLASS ,
                    round((prcnt.busy*parameter.cpu_count)/100,3)          AAS,
                 BEGIN_TIME ,
                 END_TIME
            from
              ( select value busy, BEGIN_TIME,END_TIME from v$sysmetric where metric_name='Host CPU Utilization (%)' and group_id=2 ) prcnt,
              ( select value cpu_count from v$parameter where name='cpu_count' )  parameter
          union
             select
               'CPU_ORA_DEMAND'                                            CLASS,
               nvl(round( sum(decode(session_state,'ON CPU',1,0))/60,2),0) AAS,
               cast(min(SAMPLE_TIME) as date) BEGIN_TIME ,
               cast(max(SAMPLE_TIME) as date) END_TIME
             from v$active_session_history ash
              where SAMPLE_TIME >= (select BEGIN_TIME from v$sysmetric where metric_name='CPU Usage Per Sec' and group_id=2 )
               and SAMPLE_TIME < (select END_TIME from v$sysmetric where metric_name='CPU Usage Per Sec' and group_id=2 )
)
select
       to_char(BEGIN_TIME,'HH:MI:SS') BEGIN_TIME,
       to_char(END_TIME,'HH:MI:SS') END_TIME,
       CPU_OS CPU_TOTAL,
       decode(sign(CPU_OS-CPU_ORA_CONSUMED), -1, 0, (CPU_OS - CPU_ORA_CONSUMED )) CPU_OS,
       CPU_ORA_CONSUMED CPU_ORA,
       decode(sign(CPU_ORA_DEMAND-CPU_ORA_CONSUMED), -1, 0, (CPU_ORA_DEMAND - CPU_ORA_CONSUMED )) CPU_ORA_WAIT,
       COMMIT,
       READIO,
       WAIT 
       -- ,(  decode(sign(CPU_OS - CPU_ORA_CONSUMED), -1, 0, 
       --                (CPU_OS - CPU_ORA_CONSUMED))
       --    + CPU_ORA_CONSUMED +
       --  decode(sign(CPU_ORA_DEMAND - CPU_ORA_CONSUMED), -1, 0, 
       --             (CPU_ORA_DEMAND - CPU_ORA_CONSUMED ))) STACKED_CPU_TOTAL 
from ( 
        select 
                min(BEGIN_TIME) BEGIN_TIME,
                max(END_TIME) END_TIME, 
                sum(decode(CLASS,'CPU_ORA_CONSUMED',AAS,0)) CPU_ORA_CONSUMED, 
                sum(decode(CLASS,'CPU_ORA_DEMAND' ,AAS,0)) CPU_ORA_DEMAND, 
                sum(decode(CLASS,'CPU_OS' ,AAS,0)) CPU_OS, 
                sum(decode(CLASS,'Commit' ,AAS,0)) COMMIT, 
                sum(decode(CLASS,'User I/O' ,AAS,0)) READIO, 
                sum(decode(CLASS,'Wait' ,AAS,0)) WAIT 
         from AASSTAT) 
/

The output now looks like

BEGIN_TI END_TIME  CPU_TOTAL	 CPU_OS    CPU_ORA CPU_ORA_WAIT     COMMIT     READIO	    WAIT
-------- -------- ---------- ---------- ---------- ------------ ---------- ---------- ----------
07:23:35 07:24:35	.044	   .024       .002	   .018 	 0	    0	    .001

I’m still open that there might be some more tweaking to do, so your milage may vary. Test, Test, Test and only trust yourself!
As a reminder the code above is on Github as part of the ASH Masters project. Feel free to get a Github account, fork the code and make changes. If you find cool new code or errors in old code, let me know and we will merge it into the ASH Masters project

 

https://github.com/khailey/ashmasters/blob/master/cpu_consumed_verses_cpuwait.sql


Oracle, performance
,

  1. Trackbacks

  1. Comments

  2. BunditJ
    | #1

    Hello Kyle,

    Just a minor typo :

    from
    —-
    select metric_name, value, metric_unit from v$sysmetric where metric_name like’%CPU%’ where group_id=2;

    to-be
    —–
    select metric_name, value, metric_unit from v$sysmetric where metric_name like ‘%CPU%’ and group_id=2;

  3. khailey
    | #2

    @BunditJ: thanks, fixed

  4. | #3

    Excellent article….thanks a lot…
    ciao
    Alberto

  5. | #4

    Excellent article…thanks a lot…
    If I had 1 CPU with 16 or more core?Something change ?
    In your example:
    CLASS AAS
    —————- ———-
    CPU_ORA_CONSUMED .002
    CPU_OS .022

    Is every core utilized by .022 ? or only 1 ? or is the sum af all cores ?

    ciao
    Alberto

  6. khailey
    | #5

    Hi Alberto: the values are in units of cores, so .022 is 1/50th of one core roughly

  7. Jared
    | #6

    This is good stuff Kyle.
    BTW, the ‘with’ clause in the last SQL is missing the ‘w’.
    Classic cut and paste error, I miss the first character all the time when selecting text.

  8. khailey
    | #7

    Hi Jared
    Thanks for the catching the missing letter. Keen eyes!
    Got it fixed up.

  9. Ahmed Abdel Fattah
    | #8

    Excellent article, thanks Kyle.

    Just a small question about generating the above graph from the query:
    did you just schedule the query to run every minute, collected the data and graph it?

    Thanks
    Ahmed

  10. khailey
    | #9

    @Ahmed: yes, I had the run on a schedule in a graphical monitor I wrote. The monitor was a variation on http://www.oraclerealworld.com/web-ash-w-ash/

  11. John
    | #10

    Kyle,
    very interesting.
    What are circumstances to use/not use the CPU_COUNT parameter as threshold to AAS when using a CMT CPU?

    OEM Grid Control uses just CPU cores as threshold, if I’m not wrong. Why?

    Thanks

  12. khailey
    | #11

    @John:
    Great question. I’ve never found it useful to look at hyper-threading counts. Maybe someone out there has and can detail where and why.
    I just use core count.

  13. olivier
    | #12

    Kyle,
    I have allways thought that ‘ON CPU’ really meant then sessions are actively using cpu.
    As far as i understand, and according to your script, ‘ON CPU’ actually includes sessions actively running on cpu but also sessions waiting for cpu. You confirm ?
    Thanks !

  14. khailey
    | #13

    @Oliver: yes, exactly. “ON CPU” means both running on CPU and sessions waiting for CPU.

  15. Volker Bauer
    | #14

    Kyle,
    thanks a lot for your excellent article.
    What a still dont understand is the interpretation of the results using a machine
    with multiple cores. If I have for example 16 cores does it mean that the result
    CPU_ORA_CONSUMED .002 has to be multiplied or divided by 16? Or does it mean that
    .002 is the result over all cores? Please shet some light on it. Brgds Volker

  16. khailey
    | #15

    .002 is measured in average active sessions in terms of CPU. Since we are just talking about CPU you can also just think about it as that much of a CPU core being used over that interval.

  17. Chris Osborne
    | #16

    Kyle,

    a quick question about CPU demand.

    I’m slightly confused by the difference between CPU consumed and CPU demand.

    Output from on of my systems looks like
    SQL> /

    CLASS AAS
    —————- ———-
    CPU_ORA_CONSUMED 9.561
    CPU_ORA_DEMAND 6.57
    CPU_OS 2.688

    I’ve interpreted this as i have a requirement to be using 6.57 cores worth of CPU, but i’m actually using 9.56 Cores worth? Sorry if this is a stupid question, but i’m not sure how to interpret this output.

    Regards,

    Chris

  18. khailey
    | #17

    @Chris: Be sure and use the last version of the script on the post. All the versions of the scripts before the “UPDATE” section don’t ensure that the data from CPU_ORA_CONSUMED is from the exact same 60 seconds as the data from CPU_ORA_DEMAND. CPU_ORA_DEMAND in all the scripts except the last version is the data from the last 60 seconds of the database, but CPU_ORA_CONSUMED could up to 60 seconds older than that. The last version of the script forces the two to correlate by find finding what 60 seconds CPU_ORA_CONSUMED is actually from and forcing CPU_ORA_DEMAND to come from that exact 60 seconds.

  19. Olivier
    | #18

    Hello kyle,
    could you please elaborate (again!) a little bit further on CPU_OS.
    While it’s clear to me that
    CPU_ORA is the average number of oracle sessions running on cpu
    CPU_ORA_WAIT is the average number of oracle sessions waiting for cpu
    and so on for all other columns,
    I hardly understand what CPU_OS relates to in terms of average number of sessions. To me CPU_ORA contains all the cpu consumed by oracle sessions, so how could it be we have sessions using non oracle cpu ?
    Thanks !
    Olivier

  20. khailey
    | #19

    Everything is measured in AAS, which is similar to OS runqueue

    CPU_TOTAL – CPU used on the host
    CPU_OS – CPU used on the host processes but not by Oracle processes, ie CPU_TOTAL – CPU_ORA
    CPU_ORA – CPU used by Oracle processes
    CPU_ORA_WAIT – CPU wanted by Oracle processes but not obtained, ie CPU_from_ASH – CPU_ORA

  21. Eric_DL
    | #20

    Hello Kyle,

    First of all, thanks for this great post.

    I have a few questions that I wanted to ask you.

    I’m working on retrieving some “high level metrics” to give our application admins an insight on the workload their app’s database is currently handling. I think a good starting point is to get the data to actually re-build (on a web portal) the graphs that Oracle’s EM shows on its “Performance” page.

    I’m not a DBA, but all web resources I’ve found, state that Oracle foreground activity is described by : DB Time = CPU Time + non-idle Wait Times

    I’m working on Oracle 11.2 instances on RHEL servers and the problem I have is that “CPU Time + non-idle Wait Times” never equals “DB Time” : it’s always a little above or a little below, within a range of about 3%.
    I’ve read that the precision of times measurements by Oracle, can vary from an OS to another (it may not be perfect yet on Linux), but I doubt the gaps I observe can be caused by “jitter-like” measurement deviations.
    I would rather suspect something more important, like “Wait For CPU Time” for instance, which is not directly measured by Oracle as you (and others) say.

    Note: Sorry for the lengthy intro but I had to describe the context.

    So, my questions are :
    – although “Wait For CPU Time” is not directly calculated by Oracle, is it nevertheless implicitly included in “DB Time” (it would be by adding all ASH CPU Times for instance) ?
    – if “Wait For CPU Time” is included in “DB Time”, as Oracle doesn’t explicitly mention it, then to which wait event class does EM affects it ?

    I don’t know if I’m really on topic here, but I thank you in advance for any help you could provide.

    Eric

  22. khailey
    | #21

    Hi Eric,

    First of a 3% jitter sounds find to me. It’s not 3% that is going to lead you astray. The whole point is finding the big bottlenecks and ignoring the small ones, i.e. theory of constraints.

    DB Time = CPU time + Wait for CPU time + non-idle waits.
    Wait for CPU time is not measured directly and only shows up in ASH data.
    If you calculate DB Time using ASH you will get both CPU time and wait for CPU time which is the main point of this blog post. I don’t use DB Time. WHo cares? What does DB Time acutally mean? What I do use is Average Active Sessions (AAS) which is DB Time/Elapsed time. AAS is like the run queue on UNIX. Also by comparing AAS to number of CPU cores and breaking down AAS into it’s different stats like CPU, Wait for CPU, IO, etc one can quickly see if there is a bottleneck where it is.

    – Kyle

  23. Eric_DL
    | #22

    @khailey
    Sure, I intended to use the AAS metric, and I found it very practical to use the single “v$metric_history” view, which gathers “the complete set of metrics captured by the AWR infrastructure” (as Oracle documentation states it), instead of parsing through several views.
    But my point is that when I add all foreground wait times (divided by INTSIZE_CSEC) and “CPU Usage Per Sec”, the result can already go beyond the value of AAS. That’s why I was questioning the inclusion of “Wait for CPU Time” by Oracle, inside AAS computation.
    Plus it means that on a stacked chart, the total of waits and CPU will be sometimes exceeding the value of AAS !

  24. khailey
    | #23

    Hi Eric,

    There are several ways to calculate AAS. Which ones are you using and comparing ? One is with DB Time/Elapsed time. The other is count(*) from ASH/Elapsed time. These will vary but overall should follow the same general trend. In my experience they are quite close but not always synced up. What I see is that AAS based on ASH is more realtime than AAS based on DB Time. It makes sense as ASH is what’s happening right now while DB TIme depends on counters that don’t always refresh immediately.

    Here is a graphic comparision http://www.oraclerealworld.com/wp-content/uploads/2014/10/Screen-Shot-2014-10-30-at-11.42.19-AM-1024×619.png

  25. Eric_DL
    | #24

    Hi Kyle,

    Thanks for answering so fast and sorry to keep bothering you with that.

    I get all metrics from the same view (v$metric_history), where “Average Active Sessions” equals exactly 100 * “Database Time Per Sec (csec/sec)”. Which makes me believe that this AAS value is indeed computed from DB Time.

    I’ll try to compute AAS from ASH instead to see if numbers fit better together.

    Thanks
    Eric

    PS: Just an example of numbers I get from v$metric_history

    CLASS: System Metrics Long Duration
    INTSIZE_CSEC: 6007

    END_TIME METRIC VALUE METRIC_UNIT
    ————— —————————— ———- ————————
    20141030-195756 Average Active Sessions 4.66699111 Active Sessions
    20141030-195756 Background CPU Usage Per Sec .208058931 CentiSeconds Per Second
    20141030-195756 Background Time Per Sec .02093694 Active Sessions
    20141030-195756 CPU Usage Per Sec 46.730025 CentiSeconds Per Second
    20141030-195756 Host CPU Usage Per Sec 273.181288 CentiSeconds Per Second
    20141030-195756 Host CPU Utilization (%) 11.4538982 % Busy/(Idle+Busy)

    CLASS: Total Time Waited (Foreground)
    INTSIZE_CSEC: 6007

    END_TIME METRIC VALUE METRIC_UNIT
    ————— —————————— ———- ————————
    20141030-195756 Administrative 0 CentiSeconds
    20141030-195756 Application 18006.1179 CentiSeconds
    20141030-195756 Cluster 0 CentiSeconds
    20141030-195756 Commit 32.748 CentiSeconds
    20141030-195756 Concurrency 0 CentiSeconds
    20141030-195756 Configuration 0 CentiSeconds
    20141030-195756 Network 20.817 CentiSeconds
    20141030-195756 Other 0 CentiSeconds
    20141030-195756 Queueing 0 CentiSeconds
    20141030-195756 Scheduler 0 CentiSeconds
    20141030-195756 System I/O 0 CentiSeconds
    20141030-195756 User I/O 7294.5065 CentiSeconds

    Below, your SQL script, showing CPU use on same interval (CPU_ORA consistent with CPU Usage Per Sec above):

    BEGIN_TIME END_TIME CPU_TOTAL CPU_OS CPU_ORA CPU_ORA_WAIT
    ————— ————— ———- ———- ———- ————
    20141030-195656 20141030-195756 2.752 2.282 .467 .003

    COMMIT READIO WAIT
    ———- ———- ———-
    .005 1.214 3.015

    And when I compute foreground CPU and wait events times :

    46.730025/100 + (18006.1179+32.748+20.817+7294.5065)/6007 = 4.68807

    while AAS = 4.66699111

  26. Eric_DL
    | #25

    Hi Kyle,

    I thought I’d give you a final update on what we said last week.

    I’ve compared both AAS values (directly computed by Oracle from DB Time vs count of active sessions in ASH upon same time range) and the latter actually have a worse precision. Plus, an increase in database load surprisingly worsens precision of this ASH-computed AAS value (see numbers below).

    To be exhaustive, I compute AAS from ASH by adding (on same minute as v$metric) :
    – number of sessions of type ‘FOREGROUND’, in state ‘WAITING’ and with wait_class different from ‘Idle’ (there should be none with that wait_class in ASH anyway)
    – number of sessions of type ‘FOREGROUND’, in state ‘ON CPU’

    ## light DB load :
    Oracle AAS = 3.05060185
    ASH-computed AAS = 3
    FG CPU + FG waits = 3.08344

    ## with a bit more load :
    Oracle AAS = 7.5191588
    ASH-computed AAS = 9.43116
    FG CPU + FG waits = 7.60034

    Well, to sum this up, I guess I’ll stick to my previous way of doing things and try to deal with measurements deviations.

    —————————————-

    Coming back to the topic of this blog post (finally!), I’ve also worked on the subject of “Wait for CPU” time and I have one question : ASH takes a snapshot of active sessions once a second, but does it means that an ‘ON CPU’ session at sample time, was actually ‘ON CPU’ the whole time since last sample ?

  27. khailey
    | #26

    Hi Eric,

    Thanks for the data. I’m curious why you think AAS from ASH is worse precision than AAS from counters. I would say that AAS from ASH is likely to be more accurate than AAS from counters specifically, as I mentioned before, because ASH includes wait for CPU where as statistical counters do not. Thus an AAS of 9 during load IMO is a more likely description of load than the AAS from FG CPU + FG waits. Also the AAS from ASH will include BG unless you specifically filter them out. I may not care about BG waits in the first pass, but I do care about BG CPU uses and BG wait for CPU.

    >> ASH takes a snapshot of active sessions once a second,
    >> but does it means that an ‘ON CPU’ session at sample time,
    >> was actually ‘ON CPU’ the whole time since last sample ?

    no but for the purposes of painting a statistical picture of load we treat it that way and it proves to be a generally reliable way to describe performance.

    You are asking some good questions that point out the difference between counters (waits and CPU counters) vs sampling in ASH and the take away for me is that ASH includes wait for CPU and the counters don’t which makes ASH more useful. The fact that ASH has wait for CPU is the point of this blog post. It’s super cool that we can derive wait for CPU from ASH by comparing ASH CPU with CPU burned which is what we get from the counters.

    Now as another wrinkle in the story ASH waits can include CPU wait time, but it’s such a wrinkle that IMO it’s not worth too much consideration.

    Finally, the point of AAS is to look at the big stuff and get away from concentrating on the little which leads to “compulsive tuning disorder” as Gaja Vaidyanatha famously quipped.

    If you are seeing a discrepancy of 3% between using counters and ASH to me that is just noise. Concentrate on the big stuff.

  28. Eric_DL
    | #27

    Hi Kyle,

    Sorry in advance for the length of my post (again!), I reworked it a few times, but it’s the shorter I could get :-/

    To answer your question, the precision issue I see with deriving AAS data from ASH, lies essentially in its sampling frequency, that is IMO too low versus the variation “frequency” of the quantities it samples.
    Although we’re not talking about digital signal processing and the concepts of “Nyquist–Shannon sampling theorem” have no real meaning here, the basic idea is the same : you can’t retrieve digital information about phenomena which duration is shorter than your sampling time interval.
    On your graph, you draw a straight line between a couple of adjacent data points, unaware that a transient variation of value has happened during the time interval between those.

    By contrast with ASH’s 1Hz sampling frequency, counters in Oracle most often have millisecond to microsecond precision (V$SESSTAT, V$SYSSTAT, …), because they are updated by Oracle’s background processes that permanently monitor and account for tasks that have been or are waiting to be done for each particular session. Of course, such background processes update counters (and metrics) each time they are themselves running on CPU, but CPU context switches happen far more frequently than once a second, and this kind of background processes generally have high priorities, which gets them CPU execution time slots more often than other Oracle processes.

    Eventually, counters-derived metrics (V$SYSMETRIC, V$SESSMETRIC, V$WAITCLASSMETRIC, …) average over larger intervals (15s or 1min), the variation rates of data that are sampled and accumulated thousands or millions of times a second, whereas ASH takes a single snapshot, once a second, of non cumulative, versatile data like sessions’ states.

    To sum this up, I’d say that counters-derived metrics trade off more frequent updates, for more precise data values.

    Don’t get me wrong, I agree with the “focus on the big stuff” motto, and with the fact that obsessive focusing on too small details about DB performance is detrimental. My questioning simply raised from the fact that I couldn’t seem to measure the different AAS components and have their sum to actually correspond to the “full AAS” value as measured by Oracle. And when I tried to replace that “full AAS” with the one derived from the sessions’ states count in ASH, the correlation proved to be even worse. I then started examining more in depth how each one was computed.

    All of the above finally leads me to the following conclusions : counters-derived AAS data and ASH-derived AAS data can not be computed together or be part of the same chart, because they’re not directly comparable, and ASH is the one that has the least precision because it aggregates much less data points in time.

    —————————————-

    This in turn, leads me to the “Wait for CPU” subject of this blog post : I’ve been thinking about it and I must confess that I can’t figure out how that information can be derived from ASH samples.

    Let me elaborate.

    I can only think of specific cases where a process or thread has to wait for CPU resources :

    – there’s no CPU available in the system because they’re all busy (system load index is higher than machine’s number of logical CPU/cores)
    – process/thread can’t be run because it needs access to CPU cache data that are currently locked for update by another process/thread
    – process/thread can’t be allocated a core to run, because it needs access to data that exist only in the cache buffer of physical CPU-1, and no core of CPU-1 can be freed at the moment
    – …. and other situations like that …

    but such low level information is only known (and managed) by an OS kernel’s scheduler.

    AFAIK, a regular process doesn’t have access to that level of information, it makes system calls to run a task, but has no clue about which CPU it will actually run on, in how much time it will be served an execution slot, or if there’s any CPU available to run it at all.
    IMO, there’s no way for Oracle to distinguish between processes or threads actually “Running On CPU” or “On CPU Run Queue” (waiting for CPU). Therefore, I think that all “CPU Time” Oracle reports, actually indistinctly incorporate both.

    But I may be wrong. What do you think ?
    Does Oracle publish technical details on that matter ?

  29. khailey
    | #28

    All valid points but the still missing the forests for the trees. Sure a process waiting for data in the registers can be called stalled and this will show up as CPU time from stats, but that’s a smaller issue than is being computed above.
    What is being computed above is wait for CPU because there are more runnable processes than CPU cores available, and yes this computation will be more error prone the smaller interval over which it is taken, but over intervals of 15 seconds or so, it has born out to be useful.

  30. Eric_DL
    | #29

    OK, call me stubborn, but it really bothers me when I feel I’m not understood … and I do now!

    My previous post was too explanatory and not getting straight to the point.

    I made it in a more direct fashion this time (and a bit less pleasant I’m afraid).

    Here we go:

    –1– ASH snaps “ON CPU” sessions, which of course includes sessions waiting for CPU at time of sampling.

    –2– Oracle’s CPU time counters account for the time spent “ON CPU” on behalf of each session, so, like ASH, time spent waiting for CPU is also included here.

    –3– Hence a derived metric like “CPU Usage Per Sec”, already accounts for that “Wait for CPU” time.

    –4– Consequently, (CPU_ORA_DEMAND – CPU_ORA_CONSUMED) cannot yield “Wait for CPU” time.

    –5– IMO, (CPU_ORA_DEMAND – CPU_ORA_CONSUMED) measures nothing else than the approximation error of CPU consumed by Oracle, as seen by ASH.

    –6– And it’s quite logical to me, because you subtract a quantity obtained from high-def data, from one obtained from decimated data (even though they cover the same time range). That’s why the result can sometimes be negative.

    –7– Now I do believe you, when you say that the use of that formula in the field, gives data that corroborate CPU load increase when machines are in CPU starvation. This fact certainly has an explanation, but to me, these numbers are not a measure of “Wait for CPU” time in a strict sense.

    I’m not obsessive about small details, I’m just questioning the validity of this calculus.

    OK, hope I made myself clearer. Now if you consider that I’m plain wrong, I’m all ears, if you wish.

    Thanks for your time and patience anyway.
    Eric

  31. olivier
    | #30

    @Eric_DL
    Eric, are you 100% sure that -2- contains the time spent waiting for cpu ? looking at v$sysmetric, it looks like to me waiting for cpu availability is not accounted in v$sysmetric, but i might be wrong.
    Olivier

  32. khailey
    | #31

    @Eric_DL
    >>Oracle’s CPU time counters account for the time spent “ON CPU” on behalf of each session,
    >> so, like ASH, time spent waiting for CPU is also included here.
    Ahh, this is probably the crucial misunderstanding. CPU counters measure CPU burned not wait for CPU in run queue. CPU burn includes CPU stalls waiting for register data but not waiting on the run queue to be on CPU.

  33. Eric_DL
    | #32

    Hi Kyle, hi Olivier,

    That’s right, knowing whether CPU Time accounts or not for waiting in run queue is a crucial point. You must admit that Oracle docs are more than vague about such details.

    My experience is that, on Linux for instance, system commands most of the time can’t do the difference between running or runnable processes. I’ve also seen some Oracle docs showing an AAS-normalized “CPU Used” metric, going above logical number of cores, which for me was a proof that processes piling up in run queue were included.

    Moreover, I have the following interrogations :
    – if Oracle can really distinguish CPU burned from waiting for CPU, then why isn’t there a specific wait event class for CPU wait ?
    – how can CPU Time not account for processes waiting in run queue, and DB Time include them ?

  34. khailey
    | #33

    Eric_DL: CPU statistics are gathered from the OS by interrogating how many CPU cycles we’ve used
    ‘ON CPU” are processed that are runnable and wanting to run from the Oracle perspective include both processed that are burning CPU cycles and those waiting on the run queue.
    Thus “ON CPU” – CPU stats = wait for CPU
    but of course that is an approximation for lots of issues include that “ON CPU” is a sampled value.
    CPU stats have historically had problems. in the Oracle 10g versions CPU wasn’t reported until a call ended. If your call was a 1 hour PL/SQL loop that just burned CPU, then your system would look idle (if there was no other activity) for that hour then you’d get a massive spike in CPU, ie a dump of all the CPU burned for that hour. In OEM we use to, based on my suggestion/hack idea, lop off the CPU usage at 100%, subtract that value, take the remainder and back fill the Average Active Session chart on the performance page (not the top activity page) until there was none left. Not a real portrait of what happened but a better alternative . Now CPU is suppose to be collected ever 5 seconds AFAIK though I’ve still seen some aberrations though nothing like the 10g days

  35. Olivier
    | #34

    @Eric_DL
    As far as i know a process is not “informed” it is going to be put on the “wait for cpu” queue. Thus there’s no way for an oracle process to account how much it’s going to wait for available cpu. On the contrary, when an oracle process is executing a system call (for example, perform a disk read), it will first issue a system call to get the current time of day, execute the disk read, and when acknowledged the disk read is performed, it can issue another system call to get the time of day, thus it can account how much time it had to wait for the disk read to complete. I think this is why an oracle process is not able to post such “wait for cpu” event.

  36. Eric_DL
    | #35

    @Olivier
    Thanks for the information. This was actually one of the puzzling point for me : how can Oracle know one of its processes is waiting in CPU run queue when the process itself isn’t aware of it (plus the sampling precision difference between ASH and counters). Now I understand that Oracle knows only the actual CPU cycles consumed, and the difference of sampling precision is accepted as a not-perfect-but-best-so-far solution.

    This subject has brought me far from my starting point, and I guess I must have almost p.ssed off Kyle sometimes 😉
    Thanks to him again for being really patient with me.

    However, my goal is still the same : retrieve metrics giving an overview of database load for my client’s critical app, and graph these metrics on a web portal, visible by top managers of the application.
    So I’ve got to have charts as precise as possible, as they will sometimes serve as a base for first level crisis-management decisions, impacting users all over the globe.
    Hence my insisting on the fact that AAS “components” (CPU used, wait times and wait for CPU) stack together and give a sum close to the “global AAS”.

    For instance, I still don’t know if that famous “Wait for CPU” time will stack-up nicely with respect to global AAS value, because Oracle’s graphical tools don’t include that data, and the sum of all “components” seems nevertheless consistent with the global AAS value !

    Still browsing Oracle’s documentation, I just found two other dynamic metrics views, V$RSRC_SESSION_INFO and V$OSSTAT, so I guess I’ll figure out for myself.

    BTW, do you know of a document published by Oracle somewhere, listing new and deprecated views (and stats/metrics), because their number is just huge, and content quasi-redundant for many of them ?

  37. khailey
    | #36

    Good discussion Erci_DL and Oliver. Added some clarifying text to the beginning of the blog post.

  38. Olivier
    | #37

    Thanks kyle, very nice update to original post, making it an absolute must read !

  39. Eric_DL
    | #38

    Hi Kyle;

    Thanks for the blog post update, makes things really clear :-)

    Just a small suggestion to enhance your SQL script here.

    I think there’s no need to compute ‘CPU_OS’ by multiplying ‘Host CPU Utilization (%)’ and ‘cpu_count’, you have the ‘Host CPU Usage Per Sec’ metric inside V$SYSMETRIC view, that already gives you that value.

    Eric

  40. Olivier
    | #39

    @Eric_DL
    CPU_OS is expressed in AAS unit.
    So the only way to compute it is to multiply the number of physical core by the percent of usage.

  41. Eric_DL
    | #40

    @Olivier
    Hi Olivier,

    I know values must be AAS-normalized, but it’s actually the case.
    In fact, most time-based metrics provided by Oracle with a name like ‘…. Per Sec’ are AAS-ready as their unit is generally ‘Centiseconds Per Second’ (i.e. 100 * AAS-normalized value).

    Well, at least, that’s what I see on my 11.2 database 😉

  42. Eric_DL
    | #41

    Hi Kyle,

    Do you know if V$OSSTAT.RSRC_MGR_CPU_WAIT_TIME stat, cumulates the same “Wait For CPU” time that you derive from ASH ?

    From its description (“Time (centi-secs) processes spent in the runnable state waiting”), I’d say it looks the same, but I don’t exactly know the function of Oracle’s Resource Manager.

    Thanks
    Eric

  43. khailey
    | #42

    If V$OSSTAT.RSRC_MGR_CPU_WAIT_TIME is a an Oracle resource manager stat it would make sense that it’s wait time for CPU imposed by Oracle’s resource manager and not wait time for CPU due to OS contention for the CPU.

  44. | #43

    Hello Kyle,

    I modified a little bit the SQL to provide the “SCHEDULER” wait class (as it could be useful to see the cpu caging in action):

    To do so, I replaced:


    )
    select
    to_char(BEGIN_TIME,’HH:MI:SS’) BEGIN_TIME,
    to_char(END_TIME,’HH:MI:SS’) END_TIME,
    CPU_OS CPU_TOTAL,
    decode(sign(CPU_OS-CPU_ORA_CONSUMED), -1, 0, (CPU_OS – CPU_ORA_CONSUMED )) CPU_OS,
    CPU_ORA_CONSUMED CPU_ORA,
    decode(sign(CPU_ORA_DEMAND-CPU_ORA_CONSUMED), -1, 0, (CPU_ORA_DEMAND – CPU_ORA_CONSUMED )) CPU_ORA_WAIT,
    COMMIT,
    READIO,
    WAIT
    — ,( decode(sign(CPU_OS – CPU_ORA_CONSUMED), -1, 0,
    — (CPU_OS – CPU_ORA_CONSUMED))
    — + CPU_ORA_CONSUMED +
    — decode(sign(CPU_ORA_DEMAND – CPU_ORA_CONSUMED), -1, 0,
    — (CPU_ORA_DEMAND – CPU_ORA_CONSUMED ))) STACKED_CPU_TOTAL
    from (
    select
    min(BEGIN_TIME) BEGIN_TIME,
    max(END_TIME) END_TIME,
    sum(decode(CLASS,’CPU_ORA_CONSUMED’,AAS,0)) CPU_ORA_CONSUMED,
    sum(decode(CLASS,’CPU_ORA_DEMAND’ ,AAS,0)) CPU_ORA_DEMAND,
    sum(decode(CLASS,’CPU_OS’ ,AAS,0)) CPU_OS,
    sum(decode(CLASS,’Commit’ ,AAS,0)) COMMIT,
    sum(decode(CLASS,’User I/O’ ,AAS,0)) READIO,
    sum(decode(CLASS,’Wait’ ,AAS,0)) WAIT
    from AASSTAT)

    by


    union
    select
    ‘SCHEDULER’ CLASS,
    nvl(round( sum(decode(session_state,’WAITING’,1,0))/60,2),0) AAS,
    cast(min(SAMPLE_TIME) as date) BEGIN_TIME ,
    cast(max(SAMPLE_TIME) as date) END_TIME
    from v$active_session_history ash
    where SAMPLE_TIME >= (select BEGIN_TIME from v$sysmetric where metric_name=’CPU Usage Per Sec’ and group_id=2 )
    and SAMPLE_TIME < (select END_TIME from v$sysmetric where metric_name='CPU Usage Per Sec' and group_id=2 )
    and wait_class='Scheduler'
    )
    select
    to_char(BEGIN_TIME,'HH:MI:SS') BEGIN_TIME,
    to_char(END_TIME,'HH:MI:SS') END_TIME,
    CPU_OS CPU_TOTAL,
    decode(sign(CPU_OS-CPU_ORA_CONSUMED), -1, 0, (CPU_OS – CPU_ORA_CONSUMED )) CPU_OS,
    CPU_ORA_CONSUMED CPU_ORA,
    decode(sign(CPU_ORA_DEMAND-CPU_ORA_CONSUMED), -1, 0, (CPU_ORA_DEMAND – CPU_ORA_CONSUMED )) CPU_ORA_WAIT,
    SCHEDULER,
    COMMIT,
    READIO,
    WAIT
    — ,( decode(sign(CPU_OS – CPU_ORA_CONSUMED), -1, 0,
    — (CPU_OS – CPU_ORA_CONSUMED))
    — + CPU_ORA_CONSUMED +
    — decode(sign(CPU_ORA_DEMAND – CPU_ORA_CONSUMED), -1, 0,
    — (CPU_ORA_DEMAND – CPU_ORA_CONSUMED ))) STACKED_CPU_TOTAL
    from (
    select
    min(BEGIN_TIME) BEGIN_TIME,
    max(END_TIME) END_TIME,
    sum(decode(CLASS,'CPU_ORA_CONSUMED',AAS,0)) CPU_ORA_CONSUMED,
    sum(decode(CLASS,'CPU_ORA_DEMAND' ,AAS,0)) CPU_ORA_DEMAND,
    sum(decode(CLASS,'CPU_OS' ,AAS,0)) CPU_OS,
    sum(decode(CLASS,'SCHEDULER' ,AAS,0)) SCHEDULER,
    sum(decode(CLASS,'Commit' ,AAS,0)) COMMIT,
    sum(decode(CLASS,'User I/O' ,AAS,0)) READIO,
    sum(decode(CLASS,'Wait' ,AAS,0)) WAIT
    from AASSTAT)
    "

    Bertrand

  45. Jared
    | #44

    So I wanted a version of this that would work with gv$.
    Please take a look at see if I have made any egregious mistakes.

    with AASSTAT as (
    select
    decode(n.wait_class,’User I/O’,’User I/O’,
    ‘Commit’,’Commit’,
    ‘Wait’) CLASS,
    sum(round(m.time_waited/m.INTSIZE_CSEC,3)) AAS,
    BEGIN_TIME ,
    END_TIME
    from gv$waitclassmetric m,
    gv$system_wait_class n
    where m.wait_class_id=n.wait_class_id
    and n.wait_class != ‘Idle’
    and n.inst_id = m.inst_id
    group by decode(n.wait_class,’User I/O’,’User I/O’, ‘Commit’,’Commit’, ‘Wait’), BEGIN_TIME, END_TIME
    union
    select ‘CPU_ORA_CONSUMED’ CLASS,
    round(sum(value)/100,3) AAS,
    min(BEGIN_TIME) begin_time,
    max(END_TIME) end_time
    from gv$sysmetric
    where metric_name=’CPU Usage Per Sec’
    and group_id=2
    group by metric_name
    union
    select ‘CPU_OS’ CLASS ,
    round((prcnt.busy*parameter.cpu_count)/100,3) AAS,
    BEGIN_TIME ,
    END_TIME
    from
    ( select sum(value) busy, min(BEGIN_TIME) begin_time,max(END_TIME) end_time from gv$sysmetric where metric_name=’Host CPU Utilization (%)’ and group_id=2 ) prcnt,
    ( select sum(value) cpu_count from gv$parameter where name=’cpu_count’ ) parameter
    union
    select
    ‘CPU_ORA_DEMAND’ CLASS,
    nvl(round( sum(decode(session_state,’ON CPU’,1,0))/60,2),0) AAS,
    cast(min(SAMPLE_TIME) as date) BEGIN_TIME ,
    cast(max(SAMPLE_TIME) as date) END_TIME
    from gv$active_session_history ash
    where SAMPLE_TIME >= (select min(BEGIN_TIME) begin_time from gv$sysmetric where metric_name=’CPU Usage Per Sec’ and group_id=2 )
    and SAMPLE_TIME < (select max(END_TIME) end_time from gv$sysmetric where metric_name='CPU Usage Per Sec' and group_id=2 )
    )
    select
    to_char(BEGIN_TIME,'HH:MI:SS') BEGIN_TIME,
    to_char(END_TIME,'HH:MI:SS') END_TIME,
    ( decode(sign(CPU_OS-CPU_ORA_CONSUMED), -1, 0, (CPU_OS – CPU_ORA_CONSUMED )) +
    CPU_ORA_CONSUMED +
    decode(sign(CPU_ORA_DEMAND-CPU_ORA_CONSUMED), -1, 0, (CPU_ORA_DEMAND – CPU_ORA_CONSUMED ))) CPU_TOTAL,
    decode(sign(CPU_OS-CPU_ORA_CONSUMED), -1, 0, (CPU_OS – CPU_ORA_CONSUMED )) CPU_OS,
    CPU_ORA_CONSUMED CPU_ORA,
    decode(sign(CPU_ORA_DEMAND-CPU_ORA_CONSUMED), -1, 0, (CPU_ORA_DEMAND – CPU_ORA_CONSUMED )) CPU_ORA_WAIT,
    COMMIT,
    READIO,
    WAIT
    from (
    select
    min(BEGIN_TIME) BEGIN_TIME,
    max(END_TIME) END_TIME,
    sum(decode(CLASS,'CPU_ORA_CONSUMED',AAS,0)) CPU_ORA_CONSUMED,
    sum(decode(CLASS,'CPU_ORA_DEMAND' ,AAS,0)) CPU_ORA_DEMAND,
    sum(decode(CLASS,'CPU_OS' ,AAS,0)) CPU_OS,
    sum(decode(CLASS,'Commit' ,AAS,0)) COMMIT,
    sum(decode(CLASS,'User I/O' ,AAS,0)) READIO,
    sum(decode(CLASS,'Wait' ,AAS,0)) WAIT
    from AASSTAT)

  46. Jared
    | #45

    Hmm, I see a problem – times will not match between instances.
    I’m not too sure that will matter though – any thoughts?

  47. EB
    | #46

    Hi Kyle
    I am missing somepoint why you divide “Cpu Usage per sec” to 100 I know that metric is already centisecond ?

    Best Regards

  48. khailey
    | #47

    Hi EB,

    I want the value in seconds. CPU seconds per second is the same as # of uses on the CPU (on average). If I have 100 centi-seconds of CPU burned in 1 second that is equivalent to (100 centi-seconds CPU)/ 100 centi-seconds per second ) = 1 second of CPU time, i.e. 1 users on CPU (on average = could be 1 or many users in the 1 second of elapsed time).

    – Kyle

  49. EB
    | #48

    Hi Kyle
    if you want value in second why do you calculate waits in centisecond ?(m.time_waited/m.INTSIZE_CSEC) is in centi-second ?

    Best Regards

  50. JDw
    | #49

    Hi Kyle,

    I really didn’t understand if the CPU_TOTAL, CPU_OS, CPU_ORA and so on columns are expressed in porcentage of CPU utilization or AAS (average active sessions) or units like i.e if I get 1 on some column, this means = 100% of one core utilization ? If this last one is correct , I can’t understand.

    I have a 80 cpu’s (4 sockets 10-core and hyperthreading ) and my actual utilization is about 3-4% in the OS view.

    If the last one is right, are you saying I have 3 core’s on the 100% utilization ?

    Could you give an example using the below information ?

    BEGIN_TIME END_TIME CPU_TOTAL CPU_OS CPU_ORA CPU_ORA_WAIT COMMIT READIO WAIT
    ——————– ——————– ———- ———- ———- ———— ———- ———- ———-
    04:09:48 04:10:48 3.347 1.724 1.623 0 .243 .332 .363

    Thank you

  51. khailey
    | #50

    >> are you saying I have 3 core’s on the 100% utilization ?

    no – it means that you are burning the equivalent to 3.347 (CPU_OS + CPU_ORA) CPUs at a 100%. Now I have no idea how that load is spread across the CPUs. A couple of CPUs could be at 100% if a process is pegged on it, but could also be spread across the CPUs evening. That’s not the point here. The point here is to know how much CPU we are using at the OS level and at the Oracle level and to know if Oracle processes have to wait for CPU.

    An AAS based only on CPU activity of 1 is equivalent to 1 CPU at 100% but if you have many CPUs this load could be spread across those CPUs.

  52. JDw
    | #51

    Hi Kyle,

    You are my super teacher…. Tks again..
    So, in this information below, I have 3.347 CPU’s (quantity) being burning at this moment (right now), in which 1.724 are being burning by S.O and 1.623 by Oracle process.

    CPU_TOTAL = 3.347
    CPU_OS = 1.724
    CPU_ORA = 1.623

    All above its about quantity of number of CPU’s being used and NOT AVERAGE ACTIVE SESSION, right ? (Because I can get more than 1 session at the same time in a CPU).

    Regards

  53. khailey
    | #52

    yes thats right
    as far as measuring in AAS it makes much much more intuitive sense than micro seconds on CPU or CPU %. Those don’t tell me much.
    AAS can consist of many parts such as waiting for I/O or locks or latches
    In this case we are just talking about CPU so AAS in this discussion is just the number of processes that want to run on CPU.
    How many actually get to run is based on actual CPU burned.

  54. Luca
    | #53

    Hi khailey,

    I changed your code , in relation the waits, to discriminate between background and foreground activity, what is your opinion in relation to this change (this “enhancement” works only with version 11++):

    set linesize 300
    column begin_time format a10

    with AASSTAT as (
    select
    decode(n.wait_class,’User I/O’,’User I/O_fg’,
    ‘Commit’,’Commit_fg’,
    ‘Wait_fg’) CLASS,
    sum(round(m.TIME_WAITED_FG/m.INTSIZE_CSEC,3)) AAS,
    BEGIN_TIME ,
    END_TIME
    from v$waitclassmetric m,
    v$system_wait_class n
    where m.wait_class_id=n.wait_class_id
    and n.wait_class != ‘Idle’
    group by decode(n.wait_class,’User I/O’,’User I/O_fg’, ‘Commit’,’Commit_fg’, ‘Wait_fg’), BEGIN_TIME, END_TIME
    union all
    select
    decode(n.wait_class,’User I/O’,’User I/O’,
    ‘Commit’,’Commit’,
    ‘Wait’) CLASS,
    sum(round(m.time_waited/m.INTSIZE_CSEC,3)) AAS,
    BEGIN_TIME ,
    END_TIME
    from v$waitclassmetric m,
    v$system_wait_class n
    where m.wait_class_id=n.wait_class_id
    and n.wait_class != ‘Idle’
    group by decode(n.wait_class,’User I/O’,’User I/O’, ‘Commit’,’Commit’, ‘Wait’), BEGIN_TIME, END_TIME
    union
    select
    decode(n.wait_class,’User I/O’,’User I/O’,
    ‘Commit’,’Commit’,
    ‘Wait’) CLASS,
    sum(round(m.time_waited/m.INTSIZE_CSEC,3)) AAS,
    BEGIN_TIME ,
    END_TIME
    from v$waitclassmetric m,
    v$system_wait_class n
    where m.wait_class_id=n.wait_class_id
    and n.wait_class != ‘Idle’
    group by decode(n.wait_class,’User I/O’,’User I/O’, ‘Commit’,’Commit’, ‘Wait’), BEGIN_TIME, END_TIME
    union
    select ‘CPU_ORA_CONSUMED’ CLASS,
    round(value/100,3) AAS,
    BEGIN_TIME ,
    END_TIME
    from v$sysmetric
    where metric_name=’CPU Usage Per Sec’
    and group_id=2
    union
    select ‘CPU_OS’ CLASS ,
    round((prcnt.busy*parameter.cpu_count)/100,3) AAS,
    BEGIN_TIME ,
    END_TIME
    from
    ( select value busy, BEGIN_TIME,END_TIME from v$sysmetric where metric_name=’Host CPU Utilization (%)’ and group_id=2 ) prcnt,
    ( select value cpu_count from v$parameter where name=’cpu_count’ ) parameter
    union
    select
    ‘CPU_ORA_DEMAND’ CLASS,
    nvl(round( sum(decode(session_state,’ON CPU’,1,0))/60,2),0) AAS,
    cast(min(SAMPLE_TIME) as date) BEGIN_TIME ,
    cast(max(SAMPLE_TIME) as date) END_TIME
    from v$active_session_history ash
    where SAMPLE_TIME >= (select BEGIN_TIME from v$sysmetric where metric_name=’CPU Usage Per Sec’ and group_id=2 )
    and SAMPLE_TIME < (select END_TIME from v$sysmetric where metric_name='CPU Usage Per Sec' and group_id=2 )
    )
    select
    to_char(BEGIN_TIME,'HH:MI:SS') BEGIN_TIME,
    to_char(END_TIME,'HH:MI:SS') END_TIME,
    CPU_OS CPU_TOTAL,
    decode(sign(CPU_OS-CPU_ORA_CONSUMED), -1, 0, (CPU_OS – CPU_ORA_CONSUMED )) CPU_OS,
    CPU_ORA_CONSUMED CPU_ORA,
    decode(sign(CPU_ORA_DEMAND-CPU_ORA_CONSUMED), -1, 0, (CPU_ORA_DEMAND – CPU_ORA_CONSUMED )) CPU_ORA_WAIT,
    COMMIT as COMMIT_TOT,
    COMMIT-COMMIT_FG as COMMIT_BG ,
    COMMIT_FG,
    IO as IO_TOT,
    IO-IO_FG as IO_BG,
    IO_FG,
    WAIT as OTHERS_WAIT_TOT,
    WAIT-WAIT_FG as OTHERS_WAIT_BG,
    WAIT_FG as OTHERS_WAIT_FG
    — ,( decode(sign(CPU_OS – CPU_ORA_CONSUMED), -1, 0,
    — (CPU_OS – CPU_ORA_CONSUMED))
    — + CPU_ORA_CONSUMED +
    — decode(sign(CPU_ORA_DEMAND – CPU_ORA_CONSUMED), -1, 0,
    — (CPU_ORA_DEMAND – CPU_ORA_CONSUMED ))) STACKED_CPU_TOTAL
    from (
    select
    min(BEGIN_TIME) BEGIN_TIME,
    max(END_TIME) END_TIME,
    sum(decode(CLASS,'CPU_ORA_CONSUMED',AAS,0)) CPU_ORA_CONSUMED,
    sum(decode(CLASS,'CPU_ORA_DEMAND' ,AAS,0)) CPU_ORA_DEMAND,
    sum(decode(CLASS,'CPU_OS' ,AAS,0)) CPU_OS,
    sum(decode(CLASS,'Commit' ,AAS,0)) COMMIT,
    sum(decode(CLASS,'User I/O' ,AAS,0)) IO,
    sum(decode(CLASS,'Wait' ,AAS,0)) WAIT,
    sum(decode(CLASS,'Commit_fg' ,AAS,0)) COMMIT_FG,
    sum(decode(CLASS,'User I/O_fg' ,AAS,0)) IO_FG,
    sum(decode(CLASS,'Wait_fg' ,AAS,0)) WAIT_FG
    from AASSTAT)
    /

    Thanks in advance.

    Luca.

  55. Luca
    | #54

    Hi,

    sorry add one portion of code wrong ,this is “correct” query (whitout duplicate select for TOTAL WAITS ):

    set linesize 300
    column begin_time format a10

    with AASSTAT as (
    select
    decode(n.wait_class,’User I/O’,’User I/O_fg’,
    ‘Commit’,’Commit_fg’,
    ‘Wait_fg’) CLASS,
    sum(round(m.TIME_WAITED_FG/m.INTSIZE_CSEC,3)) AAS,
    BEGIN_TIME ,
    END_TIME
    from v$waitclassmetric m,
    v$system_wait_class n
    where m.wait_class_id=n.wait_class_id
    and n.wait_class != ‘Idle’
    group by decode(n.wait_class,’User I/O’,’User I/O_fg’, ‘Commit’,’Commit_fg’, ‘Wait_fg’), BEGIN_TIME, END_TIME
    union all
    select
    decode(n.wait_class,’User I/O’,’User I/O’,
    ‘Commit’,’Commit’,
    ‘Wait’) CLASS,
    sum(round(m.time_waited/m.INTSIZE_CSEC,3)) AAS,
    BEGIN_TIME ,
    END_TIME
    from v$waitclassmetric m,
    v$system_wait_class n
    where m.wait_class_id=n.wait_class_id
    and n.wait_class != ‘Idle’
    group by decode(n.wait_class,’User I/O’,’User I/O’, ‘Commit’,’Commit’, ‘Wait’), BEGIN_TIME, END_TIME
    union
    select ‘CPU_ORA_CONSUMED’ CLASS,
    round(value/100,3) AAS,
    BEGIN_TIME ,
    END_TIME
    from v$sysmetric
    where metric_name=’CPU Usage Per Sec’
    and group_id=2
    union
    select ‘CPU_OS’ CLASS ,
    round((prcnt.busy*parameter.cpu_count)/100,3) AAS,
    BEGIN_TIME ,
    END_TIME
    from
    ( select value busy, BEGIN_TIME,END_TIME from v$sysmetric where metric_name=’Host CPU Utilization (%)’ and group_id=2 ) prcnt,
    ( select value cpu_count from v$parameter where name=’cpu_count’ ) parameter
    union
    select
    ‘CPU_ORA_DEMAND’ CLASS,
    nvl(round( sum(decode(session_state,’ON CPU’,1,0))/60,2),0) AAS,
    cast(min(SAMPLE_TIME) as date) BEGIN_TIME ,
    cast(max(SAMPLE_TIME) as date) END_TIME
    from v$active_session_history ash
    where SAMPLE_TIME >= (select BEGIN_TIME from v$sysmetric where metric_name=’CPU Usage Per Sec’ and group_id=2 )
    and SAMPLE_TIME < (select END_TIME from v$sysmetric where metric_name='CPU Usage Per Sec' and group_id=2 )
    )
    select
    to_char(BEGIN_TIME,'HH:MI:SS') BEGIN_TIME,
    to_char(END_TIME,'HH:MI:SS') END_TIME,
    CPU_OS CPU_TOTAL,
    decode(sign(CPU_OS-CPU_ORA_CONSUMED), -1, 0, (CPU_OS – CPU_ORA_CONSUMED )) CPU_OS,
    CPU_ORA_CONSUMED CPU_ORA,
    decode(sign(CPU_ORA_DEMAND-CPU_ORA_CONSUMED), -1, 0, (CPU_ORA_DEMAND – CPU_ORA_CONSUMED )) CPU_ORA_WAIT,
    COMMIT as COMMIT_TOT,
    COMMIT-COMMIT_FG as COMMIT_BG ,
    COMMIT_FG,
    IO as IO_TOT,
    IO-IO_FG as IO_BG,
    IO_FG,
    WAIT as OTHERS_WAIT_TOT,
    WAIT-WAIT_FG as OTHERS_WAIT_BG,
    WAIT_FG as OTHERS_WAIT_FG
    — ,( decode(sign(CPU_OS – CPU_ORA_CONSUMED), -1, 0,
    — (CPU_OS – CPU_ORA_CONSUMED))
    — + CPU_ORA_CONSUMED +
    — decode(sign(CPU_ORA_DEMAND – CPU_ORA_CONSUMED), -1, 0,
    — (CPU_ORA_DEMAND – CPU_ORA_CONSUMED ))) STACKED_CPU_TOTAL
    from (
    select
    min(BEGIN_TIME) BEGIN_TIME,
    max(END_TIME) END_TIME,
    sum(decode(CLASS,'CPU_ORA_CONSUMED',AAS,0)) CPU_ORA_CONSUMED,
    sum(decode(CLASS,'CPU_ORA_DEMAND' ,AAS,0)) CPU_ORA_DEMAND,
    sum(decode(CLASS,'CPU_OS' ,AAS,0)) CPU_OS,
    sum(decode(CLASS,'Commit' ,AAS,0)) COMMIT,
    sum(decode(CLASS,'User I/O' ,AAS,0)) IO,
    sum(decode(CLASS,'Wait' ,AAS,0)) WAIT,
    sum(decode(CLASS,'Commit_fg' ,AAS,0)) COMMIT_FG,
    sum(decode(CLASS,'User I/O_fg' ,AAS,0)) IO_FG,
    sum(decode(CLASS,'Wait_fg' ,AAS,0)) WAIT_FG
    from AASSTAT)
    /


− 1 = six