MySQL waits and CPU

March 23rd, 2022

Question: how do I know that one user is waiting for a lock and another  user is burning CPU on MySQL?

i.e. how can I create active session history on MySQL like on Oracle?

Two simple cases I’m looking at
  1. session waiting on lock
    1. The wait event is ” wait/io/table/sql/handler” which happens for a number of potential reasons
    2. -> I can join to information_schema views INNODB_LOCK_WAITS & INNODB_TRX but this is heavy, and I’d rather only access those after I know there is a lock waiter.
  2. session burning CPU
    1. How do I know from performance schema views that the connection is active and burning CPU?
    2. It has a wait event that shows up wait/synch/mutex/sql/THD::LOCK_thd_data
    3. on Oracle to determine CPU, we just look at connections that are active and not waiting.
Also what is NESTING_EVENT_TYPE?  I see the same thread sometime twice in events_waits_current with NESTING_EVENT_TYPE STAGE and WAIT.
On some systems for the lock test below the NESTING_EVENT_TYPE shows up as NULL and others as STAGE.

 

I basically want to monitor user activity and collect who is active and if they are waiting for something or if they are on CPU.

Row Lock

First user
             start transaction;
             update toto set id=1;
Second user does
             update toto set id=4; -- hangs

How do I see second user is waiting for a row lock? Here is output from

  • performance_schema.threads
  • performance_schema.events_waits_current

For a connection waiting on a lock:

select * from performance_schema.threads  where thread_id=5329\G
           THREAD_ID: 5329
               NAME: thread/sql/one_connection
               TYPE: FOREGROUND
     PROCESSLIST_ID: 5223
   PROCESSLIST_USER: root
   PROCESSLIST_HOST: localhost
     PROCESSLIST_DB: kyle
PROCESSLIST_COMMAND: Query
   PROCESSLIST_TIME: 782
  PROCESSLIST_STATE: updating
   PROCESSLIST_INFO: update toto set id=4
   PARENT_THREAD_ID: NULL
               ROLE: NULL
       INSTRUMENTED: YES
            HISTORY: YES
    CONNECTION_TYPE: SSL/TLS
       THREAD_OS_ID: 2088
     RESOURCE_GROUP: USR_default
     
select * from performance_schema.events_waits_current  where thread_id=5329\G
*************************** 1. row ***************************
            THREAD_ID: 5329
             EVENT_ID: 740239187
         END_EVENT_ID: NULL
           EVENT_NAME: wait/io/table/sql/handler
               SOURCE: handler.cc:2991
          TIMER_START: 45606173945401364
            TIMER_END: 46413477812683064
           TIMER_WAIT: 807303867281700
                SPINS: NULL
        OBJECT_SCHEMA: kyle
          OBJECT_NAME: toto
           INDEX_NAME: NULL
          OBJECT_TYPE: TABLE
OBJECT_INSTANCE_BEGIN: 139872327141152
     NESTING_EVENT_ID: 740239186
   NESTING_EVENT_TYPE: STAGE
            OPERATION: fetch
      NUMBER_OF_BYTES: 61

CPU

Have user running a procedure that just burns CPU running “call minute_rollup(100000000)”.  Here is output from
  • performance_schema.threads
  • performance_schema.events_waits_current
 select * from performance_schema.threads where thread_id= 6273  \G
 *************************** 1. row ***************************
          THREAD_ID: 6273
               NAME: thread/sql/one_connection
               TYPE: FOREGROUND
     PROCESSLIST_ID: 6167
   PROCESSLIST_USER: root
   PROCESSLIST_HOST: localhost
     PROCESSLIST_DB: kyle
PROCESSLIST_COMMAND: Query
   PROCESSLIST_TIME: 668
  PROCESSLIST_STATE: Opening tables
   PROCESSLIST_INFO: call minute_rollup(100000000)
   PARENT_THREAD_ID: NULL
               ROLE: NULL
       INSTRUMENTED: YES
            HISTORY: YES
    CONNECTION_TYPE: SSL/TLS
       THREAD_OS_ID: 2087
     RESOURCE_GROUP: USR_default
     
    select * from performance_schema.events_waits_current  where thread_id=6273\G
             THREAD_ID: 6273
             EVENT_ID: 403685579
         END_EVENT_ID: 403685579
           EVENT_NAME: wait/synch/mutex/sql/THD::LOCK_thd_data
               SOURCE: sql_class.h:4142
          TIMER_START: 46300974764007954
            TIMER_END: 46300974764070510
           TIMER_WAIT: 62556
                SPINS: NULL
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
           INDEX_NAME: NULL
          OBJECT_TYPE: NULL
OBJECT_INSTANCE_BEGIN: 139871988613512
     NESTING_EVENT_ID: 403685578
   NESTING_EVENT_TYPE: STATEMENT
            OPERATION: lock
      NUMBER_OF_BYTES: NULL
                FLAGS: NULL

Procedure to burn CPU:

DELIMITER $$
DROP PROCEDURE IF EXISTS minute_rollup$$
CREATE PROCEDURE minute_rollup(input_number INT)
BEGIN
DECLARE counter int;
DECLARE out_number float;
set counter=0;
WHILE counter <= input_number DO
SET out_number=SQRT(rand());
SET counter = counter + 1;
END WHILE;
END$$
DELIMITER ;

call minute_rollup(100000000);

I think the trick to see if the connection is on CPU is performance_schema.events_waits_current.EVENT_ID= END_EVENT_ID then we are on CPU

 

select
       t.thread_id,
       IF(w.EVENT_ID=w.END_EVENT_ID,'CPU', w.EVENT_NAME) AS wait_event,
       REPLACE( REPLACE(substr(t.PROCESSLIST_info ,1,80), '\r', '') , '\n','') sql_text 
from
           performance_schema.threads t
 left join performance_schema.events_waits_current w on w.thread_id = t.thread_id
where 
        t.PROCESSLIST_STATE IS NOT NULL
 and ( t.PROCESSLIST_COMMAND != 'Sleep' and t.PROCESSLIST_COMMAND != 'Daemon')
 and t.PROCESSLIST_ID != CONNECTION_ID();

+-----------+-------------------------------------------+----------------------------------------------------------------------------------+
| thread_id | wait_event                                | sql_text                                                                         |
+-----------+-------------------------------------------+----------------------------------------------------------------------------------+
| 518002247 | CPU                                       | SELECT anon_1.anon_2_anon_3_customer_id_ AS anon_1_anon_2_anon_3_customer_id_, a |
| 518007382 | CPU                                       | COMMIT                                                                           |
| 518006817 | wait/io/table/sql/handler                 | SELECT anon_1.anon_2_customer_id_ AS anon_1_anon_2_customer_id_, anon_1.anon_2_c |
| 518007025 | wait/io/table/sql/handler                 | update customer            set name=name            where id_ in (2, 1);         |
| 518007027 | wait/synch/cond/sql/Item_func_sleep::cond | select sleep(20)                                                                 |
| 518007028 | wait/io/table/sql/handler                 | update customer            set name=name            where id_ in (8, 4);         |
| 518007029 | wait/io/table/sql/handler                 | update customer            set name=name            where id_ in (6, 3);         |
| 518007032 | wait/synch/cond/sql/Item_func_sleep::cond | select sleep(20)                                                                 |
| 518007034 | wait/io/table/sql/handler                 | SELECT anon_1.anon_2_customer_id_ AS anon_1_anon_2_customer_id_, anon_1.anon_2_c |
| 518007035 | wait/synch/cond/sql/Item_func_sleep::cond | select sleep(20)                                                                 |
| 518007039 | wait/io/table/sql/handler                 | update customer            set name=name            where id_ in (5, 3);         |
| 518005220 | wait/io/table/sql/handler                 | SELECT anon_1.customer_id_ AS anon_1_customer_id_, anon_1.customer_name AS anon_ |
| 518007058 | wait/io/table/sql/handler                 | update customer            set name=name            where id_ in (4, 2);         |
| 518004406 | wait/io/table/sql/handler                 | SELECT anon_1.anon_2_customer_id_ AS anon_1_anon_2_customer_id_, anon_1.anon_2_c |
| 518003240 | wait/io/table/sql/handler                 | update toto set id=4                                                             |
+-----------+-------------------------------------------+----------------------------------------------------------------------------------+
 

One unusual thing is that the thread can show up more than once in events_waits_current.

 select
    t.thread_id,
    w.NESTING_EVENT_TYPE,
    IF(w.EVENT_ID=w.END_EVENT_ID,'CPU', w.EVENT_NAME)  AS wait_event,
    REPLACE(    REPLACE(substr(t.PROCESSLIST_info ,1,80), '\r', '') , '\n','') sql_text 
from
               performance_schema.threads t
    left join  performance_schema.events_waits_current w on w.thread_id = t.thread_id
where 
        t.PROCESSLIST_STATE IS NOT NULL
  and ( t.PROCESSLIST_COMMAND != 'Sleep' and t.PROCESSLIST_COMMAND != 'Daemon')
  and   t.PROCESSLIST_ID != CONNECTION_ID()

+-----------+--------------------+-------------------------------------------+----------------------------------------------------------------------------------+
| thread_id | NESTING_EVENT_TYPE | wait_event                                | sql_text                                                                         |
+-----------+--------------------+-------------------------------------------+----------------------------------------------------------------------------------+
| 518010010 | STATEMENT          | wait/io/table/sql/handler                 | SELECT customer.id_ AS customer_id_, customer.name AS customer_name, customer.de |
| 518010010 | WAIT               | wait/io/file/sql/hash_join                | SELECT customer.id_ AS customer_id_, customer.name AS customer_name, customer.de |
| 518010011 | STATEMENT          | wait/io/table/sql/handler                 | SELECT customer.id_ AS customer_id_, customer.name AS customer_name, customer.de |
| 518010011 | WAIT               | wait/io/file/sql/hash_join                | SELECT customer.id_ AS customer_id_, customer.name AS customer_name, customer.de |

In the case of NESTING_EVENT_TYPE of type WAIT, I want that WAIT. IF there is no NESTING_EVENT_TYPE of type WAIT and only of type STATEMENT , then I want that wait (AFAICT) leading to a bit of a convoluted statement

select
    t.thread_id,
    COALESCE( ww.NESTING_EVENT_TYPE, ws.NESTING_EVENT_TYPE) NESTING_EVENT_TYPE,
    IF(ws.EVENT_ID=ws.END_EVENT_ID,'CPU',
       COALESCE( ww.EVENT_NAME, ws.EVENT_NAME) )  AS wait_event,
    REPLACE(REPLACE(substr(t.PROCESSLIST_info,1,80),'\r',''),'\n','') sql_text 
from
                performance_schema.threads t
    left   join  performance_schema.events_waits_current ww on ww.thread_id = t.thread_id
         and ww.NESTING_EVENT_TYPE = 'WAIT'
    left   join  performance_schema.events_waits_current ws on ws.thread_id = t.thread_id
            and ws.NESTING_EVENT_TYPE = 'STATEMENT'
where 
        t.PROCESSLIST_STATE IS NOT NULL
  and ( t.PROCESSLIST_COMMAND != 'Sleep' and t.PROCESSLIST_COMMAND != 'Daemon')
  and   t.PROCESSLIST_ID != CONNECTION_ID();
+-----------+--------------------+-------------------------------------------+----------------------------------------------------------------------------------+
| thread_id | NESTING_EVENT_TYPE | wait_event                                | sql_text                                                                         |
+-----------+--------------------+-------------------------------------------+----------------------------------------------------------------------------------+
| 518009610 | STATEMENT          | wait/io/table/sql/handler                 | SELECT anon_1.anon_2_anon_3_customer_id_ AS anon_1_anon_2_anon_3_customer_id_, a |
| 518010002 | STATEMENT          | CPU                                       | SELECT current_schema, sql_text, digest, digest_text, timer_start, @startup_time |
| 518003240 | STATEMENT          | wait/io/table/sql/handler                 | update toto set id=4                                                             |
| 518010412 | STATEMENT          | wait/io/table/sql/handler                 | SELECT anon_1.anon_2_customer_id_ AS anon_1_anon_2_customer_id_, anon_1.anon_2_c |
| 518010010 | WAIT               | wait/io/file/sql/hash_join                | SELECT customer.id_ AS customer_id_, customer.name AS customer_name, customer.de |
| 518010417 | WAIT               | wait/io/file/sql/hash_join                | SELECT anon_1.customer_id_ AS anon_1_customer_id_, anon_1.customer_name AS anon_ |
+-----------+--------------------+-------------------------------------------+----------------------------------------------------------------------------------+

It gets more confusing because in some cases there are multiple lines (only seen two, not more ) for the same thread with the same NESTING_EVENT_TYPE ( NULL in the below example)

SELECT
        threads.PROCESSLIST_ID,
         threads.PROCESSLIST_USER,
         threads.PROCESSLIST_HOST,
         threads.PROCESSLIST_COMMAND,
         threads.PROCESSLIST_STATE,
         current_events.TIMER_START,
         current_events.TIMER_END,
         current_events.TIMER_WAIT,
         current_events.NESTING_EVENT_TYPE,
         REPLACE(REPLACE(substr(current_events.SQL_TEXT,1,40),'\r',''),'\n','')  stmt_sql_text_short,
         REPLACE(REPLACE(substr(threads.PROCESSLIST_info,1,40),'\r',''),'\n','') thread_proc_info_short,
         current_events.CURRENT_SCHEMA,
         IF ( threads.PROCESSLIST_STATE ='User sleep','User sleep', (
            IF(current_waits.EVENT_ID=current_waits.END_EVENT_ID,'CPU', current_waits.EVENT_NAME)
          )) AS wait_event
 FROM
                    performance_schema.threads AS threads
         INNER JOIN performance_schema.events_statements_current AS current_events ON threads.THREAD_ID = current_events.THREAD_ID
         INNER JOIN performance_schema.events_waits_current AS current_waits ON threads.THREAD_ID = current_waits.THREAD_ID
  Where threads.PROCESSLIST_STATE is not NULL


+------+-----------+-------------+-----------------+--------------------+---------------------------------------------------------------------------------------------+
| PID  | PLIST_CMD | PLIST_STATE |  TIMER_WAIT     | NESTING_EVENT_TYPE |  thread_proc_info_short                   | wait_event                                       |
+------+-----------+-------------+-----------------+--------------------+---------------------------------------------------------------------------------------------+
| 8539 | Query     | executing   |    445524154000 | NULL               |  INSERT INTO order_status_change (dbm_ord | wait/io/table/sql/handler                        |
| 8539 | Query     | executing   |    445580677000 | NULL               |  INSERT INTO order_status_change (dbm_ord | CPU                                              |

| 8820 | Query     | init        |       170936000 | NULL               |  SELECT * from dbm_user WHERE id = 2404   | idle                                             |
| 8820 | Query     | init        |       170936000 | NULL               |  SELECT * from dbm_user WHERE id = 2404   | wait/io/socket/sql/client_connection             |

| 8730 | Query     | executing   |    595809657000 | NULL               |  INSERT INTO order_status_change (dbm_ord | wait/io/table/sql/handler                        |
| 8730 | Query     | executing   |    595869870000 | NULL               |  INSERT INTO order_status_change (dbm_ord | wait/synch/sxlock/innodb/lock_sys_global_rw_lock |


 


Uncategorized

  1. Trackbacks

  2. No trackbacks yet.
  1. Comments

  2. | #1

    Hey Kyle,

    I’ll try and answer your questions in sections.

    InnoDB Lock Waits
    =================

    MySQL 8.0 added `performance_schema.data_locks` and `performance_schema.data_lock_waits` instead of the I_S views (which should be totally forgotten now, as you note they are not the best for performance).

    https://dev.mysql.com/doc/refman/8.0/en/performance-schema-data-lock-waits-table.html

    Layered on top of this we added the `sys.innodb_lock_waits` view:

    https://dev.mysql.com/doc/refman/8.0/en/sys-innodb-lock-waits.html

    With your example, if you use that view you’ll see something like this (I only added one row in to this `toto`.`toto` table):

    mysql> select * from sys.innodb_lock_waits\G
    *************************** 1. row ***************************
    wait_started: 2022-03-24 06:58:07
    wait_age: 00:00:20
    wait_age_secs: 20
    locked_table: `toto`.`toto`
    locked_table_schema: toto
    locked_table_name: toto
    locked_table_partition: NULL
    locked_table_subpartition: NULL
    locked_index: PRIMARY
    locked_type: RECORD
    waiting_trx_id: 11613
    waiting_trx_started: 2022-03-24 06:58:07
    waiting_trx_age: 00:00:20
    waiting_trx_rows_locked: 1
    waiting_trx_rows_modified: 0
    waiting_pid: 2432
    waiting_query: update toto set id=4
    waiting_lock_id: 339040616:2:4:2:343253280
    waiting_lock_mode: X
    blocking_trx_id: 11607
    blocking_pid: 2431
    blocking_query: NULL
    blocking_lock_id: 339039808:2:4:2:343250464
    blocking_lock_mode: X
    blocking_trx_started: 2022-03-24 06:57:54
    blocking_trx_age: 00:00:33
    blocking_trx_rows_locked: 2
    blocking_trx_rows_modified: 0
    sql_kill_blocking_query: KILL QUERY 2431
    sql_kill_blocking_connection: KILL 2431
    1 row in set (0.18 sec)

    Sessions On CPU
    ===============

    Yea this has been something that has been missing in Performance Schema since inception, we show everywhere we are waiting and doing work, but not really the overall CPU usage.

    Luckily, that changed with 8.0.28, and a contribution from Facebook that we extended:

    https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-28.html#mysqld-8-0-28-performance-schema

    –snip–
    A new statement metric, CPU_TIME, is now available, enabling you to measure the CPU time spent on a query.

    The following changes were made to support this:

    A timer, THREAD_CPU, was added to the Performance Schema PERFORMANCE_TIMERS table.

    A consumer, events_statements_cpu was added to the Performance Schema setup_consumers table.

    events_statements_cpu is disabled by default.

    A Performance Schema command option, performance-schema-consumer-events-statements-cpu to enable or disable the consumer, events_statements_cpu.

    See Performance Schema Command Options for more information.

    The following columns were added:

    CPU_TIME column was added to the following Performance Schema tables:

    events_statements_current

    events_statements_history

    events_statements_history_long

    CPU_TIME is the time spent on CPU for the current thread, expressed in picoseconds.

    SUM_CPU_TIME column was added to the following Performance Schema tables:

    events_statements_summary_by_thread_by_event_name

    events_statements_summary_by_account_by_event_name

    events_statements_summary_by_user_by_event_name

    events_statements_summary_by_host_by_event_name

    events_statements_summary_global_by_event_name

    events_statements_summary_by_digest

    events_statements_summary_by_program

    prepared_statements_instances

    SUM_CPU_TIME is the CPU time spent on the current thread, expressed in picoseconds, for the corresponding aggregations.

    CPU_LATENCY column was added to the following sys schema tables:

    statement_analysis

    user_summary_by_statement_type

    user_summary_by_statement_latency

    host_summary_by_statement_type

    host_summary_by_statement_latency

    processlist

    session

    CPU latency is the CPU time spent on the current thread, expressed in human-readable form.

    CPU_LATENCY column was added to the following sys schema x$ tables:

    x$statement_analysis

    x$user_summary_by_statement_type

    x$host_summary_by_statement_type

    x$processlist

    x$session

    x$user_summary_by_statement_latency

    x$host_summary_by_statement_latency

    CPU latency is the CPU time spent on the current thread, expressed in picoseconds.

    Our thanks to Facebook for their contribution to this fix. (Bug #32202060, Bug #101764)

    –snip–

    This should now easily show you those burning CPU. It’s only available at the very tip of MySQL versions, however.

    Event Nesting
    =============

    I’ve written about this before, so I’m just going to point you there instead, rather than duplicate it all here:

    http://www.markleith.co.uk/2011/04/01/a-big-bag-of-epic-awesomeness/

    The ability to generate those “traces” (they are actually `dot` files) is also now built in to MySQL using the `sys.ps_trace_thread` routine:

    https://dev.mysql.com/doc/refman/8.0/en/sys-ps-trace-thread.html

  3. khailey
    | #2

    Thanks Mark – appreciate all the feedback.
    Testing on 8.0.19 , data_locks and data_lock_waits look great and perform.
    The view sys.innodb_lock_waits takes about 30 seconds on a super loaded system but on the same system data_locks and data_lock_waits both take under 100ms
    So for 8.0 I’ll use data_lock_waits to find if someone is blocked.
    Flor 5.6 & 5.7 I’ll use information_schema.INNODB_TRX and information_schema.innodb_lock_waits directly as it’s faster than using the view sys.innodb_lock_waits

  4. khailey
    | #3

    The CPU stuff looks cool. I don’t yet see it on 8.0.19 but do see it on 8.0.26


+ four = 9