Archive

Archive for the ‘Uncategorized’ Category

CLI for Amazon RDS Performance Insights

December 11th, 2018

Installing the AWS CLI on Linux

1. install pip

https://docs.aws.amazon.com/cli/latest/userguide/awscli-install-linux.html#awscli-install-linux-pip

curl -O https://bootstrap.pypa.io/get-pip.py
python get-pip.py --user

2. install AWS CLI

https://docs.aws.amazon.com/cli/latest/userguide/installing.html

pip install awscli --upgrade --user

3. configure

aws configure

For “aws configure” you will need

  • AWS Access Key ID:
  • AWS Secret Access Key:

which you can get from the AWS console by going to IAM and creating an access key.
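The prompts look like the following (the key values here are AWS’s documented placeholders):

aws configure
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-east-1
Default output format [None]: json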

Running example

Once “aws” is configured you can run the CLI like

aws \
 pi get-resource-metrics \
 --region us-east-1 \
 --service-type RDS \
 --identifier db-xxxxxx \
 --metric-queries "{\"Metric\": \"db.load.avg\"}" \
 --start-time `expr \`date +%s\` - 604800 ` \
 --end-time `date +%s` \
 --period-in-seconds 86400
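The backticked “expr” construct just computes “now minus N seconds” in epoch seconds for --start-time (here, 604800 seconds = 7 days). With modern shell arithmetic the same flag reads more clearly as:

 --start-time $(($(date +%s) - 604800)) \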

That “--identifier” is for one of my databases, so you will have to change it.
You will also have to modify the region if you are accessing a database in a different region.

Getting JSON output

export AWS_DEFAULT_OUTPUT="json"
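With JSON output you can post-process results with jq. For example, a sketch (assuming jq is installed; get-resource-metrics returns a MetricList whose entries hold DataPoints with Timestamp and Value):

aws pi get-resource-metrics \
 --region us-east-1 \
 --service-type RDS \
 --identifier db-xxxxxx \
 --metric-queries '{"Metric": "db.load.avg"}' \
 --start-time $(($(date +%s) - 3600)) \
 --end-time `date +%s` \
 --period-in-seconds 300 \
 | jq -r '.MetricList[].DataPoints[] | "\(.Timestamp) \(.Value)"'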

Documentation: API, CLI, examples.

My databases

  • db-YTDU5J5V66X7CXSCVDFD2V3SZM (Aurora PostgreSQL)
  • db-2XQCJDBHGIXKDYUVVOIUIJ34LU (Aurora MySQL)
  • db-Z2PNRYPV4J7LJLGDOKMISTWRQU (RDS MySQL)

See these blog posts for the loads running on these databases.

CPU load last 5 minutes

aws \
 pi get-resource-metrics \
 --region us-east-1 \
 --service-type RDS \
 --identifier db-YTDU5J5V66X7CXSCVDFD2V3SZM \
 --start-time `expr \`date +%s\` - 300 ` \
 --metric-queries '{
      "Metric": "db.load.avg",
      "Filter":{"db.wait_event.type": "CPU"}
      } ' \
 --end-time `date +%s` \
 --period-in-seconds 300

Top SQL by load

aws pi describe-dimension-keys \
    --region us-east-1 \
    --service-type RDS \
    --identifier db-YTDU5J5V66X7CXSCVDFD2V3SZM \
    --start-time `expr \`date +%s\` - 300 ` \
    --end-time `date +%s` \
    --metric db.load.avg \
    --group-by '{"Group":"db.sql"}'
    

Top Waits by load

aws pi describe-dimension-keys \
    --region us-east-1 \
    --service-type RDS \
    --identifier db-YTDU5J5V66X7CXSCVDFD2V3SZM \
    --start-time `expr \`date +%s\` - 300 ` \
    --end-time `date +%s` \
    --metric db.load.avg \
    --group-by '{"Group":"db.wait_event"}'

Top User by load

aws pi describe-dimension-keys \
    --region us-east-1 \
    --service-type RDS \
    --identifier db-YTDU5J5V66X7CXSCVDFD2V3SZM \
    --start-time `expr \`date +%s\` - 300 ` \
    --end-time `date +%s` \
    --metric db.load.avg \
    --group-by '{"Group":"db.user"}'

Example output fragment (from grouping by db.sql):

"Total": 0.15100671140939598, 
            "Dimensions": {
                "db.sql.db_id": "pi-4101593903", 
                "db.sql.id": "209554B4D97DBF72871AE0854DAD97385D553BAA", 
                "db.sql.tokenized_id": "1F61DDE1D315BB8F4BF198DB219D4180BC1CFE05", 
                "db.sql.statement": "WITH cte AS (\n   SELECT id     \n   FROM   authors \n   LIMIT  1     \n   )\nUPDATE authors s\nSET    email = 'toto' \nFROM   cte\nWHERE  s.id = cte.id\n\n"
            }

Top SQL by waits grouped

aws pi describe-dimension-keys \
    --region us-east-1 \
    --service-type RDS \
    --identifier db-YTDU5J5V66X7CXSCVDFD2V3SZM \
    --start-time `expr \`date +%s\` - 300 ` \
    --end-time `date +%s` \
    --metric db.load.avg \
    --group-by '{"Group":"db.sql"}' \
    --partition-by '{"Group": "db.wait_event"}'

Example output fragment:

{
            "Total": 0.1644295302013423, 
            "Dimensions": {
                "db.sql.db_id": "pi-4101593903", 
                "db.sql.id": "209554B4D97DBF72871AE0854DAD97385D553BAA", 
                "db.sql.tokenized_id": "1F61DDE1D315BB8F4BF198DB219D4180BC1CFE05", 
                "db.sql.statement": "WITH cte AS (\n   SELECT id     \n   FROM   authors \n   LIMIT  1     \n   )\nUPDATE authors s\nSET    email = 'toto' \nFROM   cte\nWHERE  s.id = cte.id\n\n"
            }, 
            "Partitions": [
                0.003355704697986577, 
                0.14093959731543623, 
                0.020134228187919462
            ]
        }, 
"PartitionKeys": [
        {
            "Dimensions": {
                "db.wait_event.type": "CPU", 
                "db.wait_event.name": "CPU"
            }
        }, 
        {
            "Dimensions": {
                "db.wait_event.type": "IO", 
                "db.wait_event.name": "IO:XactSync"
            }
        }, 
        {
            "Dimensions": {
                "db.wait_event.type": "Lock", 
                "db.wait_event.name": "Lock:transactionid"
            }
        }

Top SQL over last 5 minutes based on CPU

 aws pi describe-dimension-keys     \
    --region us-east-1     \
    --service-type RDS     \
    --identifier db-YTDU5J5V66X7CXSCVDFD2V3SZM     \
    --start-time `expr \`date +%s\` - 300 `     \
    --end-time `date +%s`     \
    --metric db.load.avg     \
    --group-by '{"Group":"db.sql"}'     \
    --filter '{"db.wait_event.type": "CPU"}'
{
    "Total": 0.003355704697986577,
    "Dimensions": {
        "db.sql.db_id": "pi-4101593903",
        "db.sql.id": "209554B4D97DBF72871AE0854DAD97385D553BAA",
        "db.sql.tokenized_id": "1F61DDE1D315BB8F4BF198DB219D4180BC1CFE05",
        "db.sql.statement": "WITH cte AS (\n SELECT id \n FROM authors \n LIMIT 1 \n )\nUPDATE authors s\nSET email = 'toto' \nFROM cte\nWHERE s.id = cte.id\n\n"
    }
}

Alternatively, with a partition by waits:

aws pi describe-dimension-keys \
 --region us-east-1 \
 --service-type RDS \
 --identifier db-YTDU5J5V66X7CXSCVDFD2V3SZM \
 --start-time `expr \`date +%s\` - 300 ` \
 --end-time `date +%s` \
 --metric db.load.avg \
 --group-by '{"Group":"db.sql"}' \
 --partition-by '{"Group": "db.wait_event"}' \
 --filter '{"db.wait_event.type": "CPU"}'

CLI with counter metrics

aws \
 pi get-resource-metrics \
 --region us-east-1 \
 --service-type RDS \
 --identifier db-VMM7GRZMTGWZNPWAJOLWTHQDDE \
 --metric-queries "{\"Metric\": \"db.Transactions.xact_commit.avg\"}" \
 --start-time `expr \`date +%s\` - 3600 ` \
 --end-time `date +%s` \
 --period-in-seconds 60 \
 --endpoint-url https://api.integ.pi.a2z.com
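Note that the “--endpoint-url” here points at what looks like an internal integration endpoint; when running against the public Performance Insights service you can simply omit that flag.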


Aurora MySQL synch/mutex/innodb/aurora_lock_thread_slot_futex wait

September 18th, 2018

Thanks to Jeremiah Wilton for the following info:

This wait event indicates that there is a thread which is waiting on an InnoDB record lock. Check your database for conflicting workloads. More information on InnoDB locking can be found here: https://dev.mysql.com/doc/refman/5.6/en/innodb-locking.html

 

In other words, record-level lock conflicts are happening. More than one connection is trying to update the last_login for a particular id in the_table at the same time. Those connections are conflicting and serializing on the record lock for that id. Here’s a query that can help you identify the blocker and waiter for InnoDB record locks in MySQL-family engines. Run this when you see the aurora_lock_thread_slot_futex wait event in Performance Insights. In a future release of Performance Insights, we will automatically generate and display a similar blockers-and-waiters report when Performance Insights detects this event.

select p1.id waiting_thread, p1.user waiting_user, p1.host waiting_host, it1.trx_query waiting_query,
       ilw.requesting_trx_id waiting_transaction, ilw.blocking_lock_id blocking_lock, il.lock_mode blocking_mode,
       il.lock_type blocking_type, ilw.blocking_trx_id blocking_transaction,
       case it.trx_state when 'LOCK WAIT' then it.trx_state else p.state end blocker_state, il.lock_table locked_table,
       it.trx_mysql_thread_id blocker_thread, p.user blocker_user, p.host blocker_host
from information_schema.innodb_lock_waits ilw
join information_schema.innodb_locks il on ilw.blocking_lock_id = il.lock_id and ilw.blocking_trx_id = il.lock_trx_id
join information_schema.innodb_trx it on ilw.blocking_trx_id = it.trx_id
join information_schema.processlist p on it.trx_mysql_thread_id = p.id
join information_schema.innodb_trx it1 on ilw.requesting_trx_id = it1.trx_id
join information_schema.processlist p1 on it1.trx_mysql_thread_id = p1.id;

+----------------+--------------+---------------------+---------------------------------------+---------------------+--------------------+---------------+---------------+----------------------+---------------+----------------------+----------------+--------------+---------------------+
| waiting_thread | waiting_user | waiting_host        | waiting_query                         | waiting_transaction | blocking_lock      | blocking_mode | blocking_type | blocking_transaction | blocker_state | locked_table         | blocker_thread | blocker_user | blocker_host        |
+----------------+--------------+---------------------+---------------------------------------+---------------------+--------------------+---------------+---------------+----------------------+---------------+----------------------+----------------+--------------+---------------------+
|           1117 | reinvent     | 172.31.51.118:34734 | UPDATE sbtest8 SET k=k+1 WHERE id=125 | 888017450           | 888017113:88:6:17  | X             | RECORD        | 888017113            | LOCK WAIT     | `sysbench`.`sbtest8` |           1196 | reinvent     | 172.31.51.118:34888 |
|           1117 | reinvent     | 172.31.51.118:34734 | UPDATE sbtest8 SET k=k+1 WHERE id=125 | 888017450           | 888017089:88:6:17  | X             | RECORD        | 888017089            | LOCK WAIT     | `sysbench`.`sbtest8` |           1431 | reinvent     | 172.31.51.118:35366 |
|           1117 | reinvent     | 172.31.51.118:34734 | UPDATE sbtest8 SET k=k+1 WHERE id=125 | 888017450           | 888015342:88:6:17  | X             | RECORD        | 888015342            | LOCK WAIT     | `sysbench`.`sbtest8` |           1680 | reinvent     | 172.31.51.118:35868 |
.
.
+----------------+--------------+---------------------+----------------------------------------+---------------------+-

Also the following:
https://dev.mysql.com/doc/refman/5.6/en/innodb-information-schema-examples.html

SELECT
  r.trx_id waiting_trx_id,
  r.trx_mysql_thread_id waiting_thread,
  r.trx_query waiting_query,
  b.trx_id blocking_trx_id,
  b.trx_mysql_thread_id blocking_thread,
  b.trx_query blocking_query
FROM       information_schema.innodb_lock_waits w
INNER JOIN information_schema.innodb_trx b
  ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.innodb_trx r
  ON r.trx_id = w.requesting_trx_id;
+----------------+----------------+----------------------------------------+-----------------+-----------------+----------------------------------------+
| waiting_trx_id | waiting_thread | waiting_query                          | blocking_trx_id | blocking_thread | blocking_query                         |
+----------------+----------------+----------------------------------------+-----------------+-----------------+----------------------------------------+
| 917169041      |           2822 | UPDATE sbtest5 SET k=k+1 WHERE id=126  | 917169007       |            2296 | UPDATE sbtest5 SET k=k+1 WHERE id=126  |
| 917169041      |           2822 | UPDATE sbtest5 SET k=k+1 WHERE id=126  | 917168488       |            2214 | UPDATE sbtest5 SET k=k+1 WHERE id=126  |
| 917169025      |           3069 | UPDATE sbtest2 SET k=k+1 WHERE id=125  | 917168945       |            2700 | UPDATE sbtest2 SET k=k+1 WHERE id=125  |
.
.
+----------------+----------------+----------------------------------------+-----------------+-----------------+----------------------------------------+
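Note: on MySQL 8.0 the information_schema.innodb_locks and innodb_lock_waits tables used above were removed in favor of performance_schema.data_locks and data_lock_waits, and the sys.innodb_lock_waits view (available from MySQL 5.7 on) gives a similar blockers-and-waiters report:

select * from sys.innodb_lock_waits;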

see AWS forum post at https://forums.aws.amazon.com/thread.jspa?threadID=289484


Is NFS on ZFS slowing you down?

January 25th, 2018

If you think so, check out the shell script “ioh.sh” on github at https://github.com/khailey/ioh

Introduction and Goals

The goal of ioh.sh is to measure both the throughput and latency of the different code layers when using NFS mounts on a ZFS appliance. The ZFS appliance code layers inspected by the script are I/O from the disks, the ZFS layer and the NFS layer. For each of these layers the script measures throughput, latency and average I/O size. Some of the layers are broken down further: NFS writes into data sync, file sync and non-sync operations, and NFS reads into cached data reads and reads that have to go to disk.

The primary three questions ioh is used to answer are

  • Is I/O latency from the I/O subsystem to ZFS appliance sufficiently fast?
  • Is NFS latency from ZFS appliance to the consumer sufficiently fast?
  • Is ZFS adding unusual latency?

One: If the latency from the I/O subsystem is not adequate, then look into supplying a better performing I/O subsystem for the ZFS appliance. For example, if the goal is 3ms write times per 1K redo write but the underlying I/O subsystem is taking 6ms, then it will be impossible for the ZFS appliance to meet those expectations.

Two: If the latency for NFS responses from the ZFS appliance is adequate and yet the NFS client reports latencies much slower (more than 2ms slower), then look instead at problems in the NIC, network or NFS client host; see network tracing, for example http://datavirtualizer.com/tcp-trace-analysis-for-nfs/

Three: If the I/O latency is sufficiently fast but ZFS latency is slow, then this could indicate a problem in the ZFS layer.

The answer to the question “what is adequate I/O latency” depends. In general a single random 8 KB block read on Oracle is expected to take 3-12 ms on average, so a typical latency is around 7.5 ms. NOTE: when measuring I/O latency on the source system it’s important to use a tool like “iostat” that shows the actual I/Os to the subsystem. The I/O measured by the Oracle database will include both I/Os satisfied from the host file system cache as well as the I/O subsystem, unless the database is running with direct I/O.

The ioh tool can also give insight into other useful information such as

  • Are IOPs getting near the supported IOPs of the underlying I/O subsystem
  • is NFS throughput getting near the maximum bandwidth of the NIC?

For example if the NIC is 1GbE then the maximum bandwidth is about 115MB/s (1 Gbit/s is 125 MB/s raw, less protocol overhead), and generally 100MB/s is a good rule of thumb for the max. If throughput is consistently near the NIC maximum, then demand is probably going above the maximum and thus increasing latency.

$ ioh.sh -h

usage: ./ioh.sh options

collects I/O related dtrace information into file "ioh.out"
and displays the results

OPTIONS:
  -h              Show this message
  -t  seconds     runtime in seconds, defaults to forever
  -c  seconds     cycle time ie time between collections, defaults to 1 second
  -f  filename    change the output file name [defaults to ioh.out]
  -p              parse the data from output file only,  don't run collection
  -d  display     optional extra data to show: [hist|histsz|histszw|topfile|all]
                    hist    - latency histogram for NFS,ZFS and IO both reads and writes
                    histsz  - latency histogram by size for NFS reads
                    histszw - latency histogram by size for NFS writes
                    topfile - top files accessed by NFS
                    all     - all the above options
                  example
                    ioh.sh -d "histsz topfile"

Two optional environment variables:

  • CWIDTH – histogram column width
  • PAD – character between columns in the histograms, null by default
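For example, to widen the histogram columns and separate them with a pipe character (a sketch; the values here are arbitrary):

sudo CWIDTH=6 PAD="|" ./ioh.sh -d hist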

Running

$ sudo ./ioh.sh 

Outputs to the screen and puts raw output into the default file name “ioh.out.[date]”. The default output file name can be changed with the “-f filename” option. The raw output can later be formatted with

$ ./ioh.sh -p  ioh.out.2012_10_30_10:49:27

By default it will look for “ioh.out”. If the raw data is in a different file it can be specified with “-f filename”.

The output looks like

date: 1335282287 , 24/3/2012 15:44:47
TCP out:  8.107 MB/s, in:  5.239 MB/s, retrans:        MB/s  ip discards:
----------------
            |       MB/s|    avg_ms| avg_sz_kb|     count
------------|-----------|----------|----------|--------------------
R |      io:|     0.005 |    24.01 |    4.899 |        1
R |     zfs:|     7.916 |     0.05 |    7.947 |     1020
C |   nfs_c:|           |          |          |        .
R |     nfs:|     7.916 |     0.09 |    8.017 |     1011
- 
W |      io:|     9.921 |    11.26 |   32.562 |      312
W | zfssync:|     5.246 |    19.81 |   11.405 |      471
W |     zfs:|     0.001 |     0.05 |    0.199 |        3
W |     nfs:|           |          |          |        .
W |nfssyncD:|     5.215 |    19.94 |   11.410 |      468
W |nfssyncF:|     0.031 |    11.48 |   16.000 |        2

The sections are broken down into

  • Header with date and TCP throughput
  • Reads
  • Writes

Reads and Writes are further broken down into

  • io
  • zfs
  • nfs

For writes, the non-stable storage writes are separated from the writes to stable storage, which are marked as “sync” writes. For NFS, the sync writes are further broken down into “data” and “file” sync writes.

examples:

The following will refresh the display every 10 seconds and display four extra sections of data

$ sudo ./ioh.sh -c 10 -d "hist histsz histszw topfile"   

date: 1335282287 , 24/3/2012 15:44:47
TCP out:  8.107 MB/s, in:  5.239 MB/s, retrans:        MB/s  ip discards:
----------------
            |       MB/s|    avg_ms| avg_sz_kb|     count
------------|-----------|----------|----------|--------------------
R |      io:|     0.005 |    24.01 |    4.899 |        1
R |     zfs:|     7.916 |     0.05 |    7.947 |     1020
R |     nfs:|     7.916 |     0.09 |    8.017 |     1011
- 
W |      io:|     9.921 |    11.26 |   32.562 |      312
W | zfssync:|     5.246 |    19.81 |   11.405 |      471
W |     zfs:|     0.001 |     0.05 |    0.199 |        3
W |     nfs:|           |          |          |        .
W |nfssyncD:|     5.215 |    19.94 |   11.410 |      468
W |nfssyncF:|     0.031 |    11.48 |   16.000 |        2
---- histograms  -------
    area r_w   32u   64u   .1m   .2m   .5m    1m    2m    4m    8m   16m   33m   65m    .1s   .3s   .5s    1s    2s    2s+
R        io      .     .     .     .     .     .     .     .     .     1     3     1
R       zfs   4743   287    44    16     4     3     .     .     .     1     2     2
R       nfs      .  2913  2028    89    17     3     .     .     .     1     2     2
-
W        io      .     .     .    58   249   236    50    63   161   381   261    84    20     1
W       zfs      3    12     2
W   zfssync      .     .     .     .    26   162   258   129   228   562   636   250    75    29
W       nfs      .     .     .     .    12   164   265   134   222   567   637   250    75    29
--- NFS latency by size ---------
    ms   size_kb
R   0.1    8     .  2909  2023    87    17     3     .     .     .     1     2     2
R   0.1   16     .     4     5     2
-
W   5.0    4     .     .     .     .     8    49    10     3     4    11     4     2     1
W  21.4    8     .     .     .     .     4    55   196    99   152   414   490   199    60    23
W  18.3   16     .     .     .     .     .    34    29    25    43    91    84    28     8     5
W  16.1   32     .     .     .     .     .    19    16     7    14    38    36    15     3
W  19.3   64     .     .     .     .     .     6    11     .     9    11    19     6     2     1
W  20.4  128     .     .     .     .     .     1     3     .     .     2     4     .     1
---- top files ----
   MB/s                  IP  PORT filename
W  0.01MB/s  172.16.103.196 52482 /domain0/group0/vdb17/datafile/home/oracle/oradata/swingb/control01.ora
W  0.02MB/s   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/oradata/sol/control01.ora
W  0.57MB/s   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/oradata/sol/undo.dbf
W  0.70MB/s   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/oradata/sol/redo3.log
W  3.93MB/s   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/opt/app/10.2.0.4/db_1/dbs/soe.dbf
-
R  0.01MB/s  172.16.100.102 39938 /domain0/group0/vdb12/datafile/home/oracle/oradata/kyle/control01.ora
R  0.01MB/s  172.16.103.196 52482 /domain0/group0/vdb17/datafile/home/oracle/oradata/swingb/control01.ora
R  0.02MB/s   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/oradata/sol/control01.ora
R  0.05MB/s   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/oradata/sol/undo.dbf
R  7.84MB/s   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/opt/app/10.2.0.4/db_1/dbs/soe.dbf
IOPs         313

Sections

First line is the date. Second line is TCP MB per second in, out and retransmitted. The last value is “ip discards”.

The three parts are all related and form a drill-down, starting with coarse-grained data at the top and moving to finer-grained data at the bottom.

  • averages – default
  • histograms [hist]
  • histograms by size reads [histsz] writes [histszw] for NFS

The first section is a quick overview.

The second section breaks out the latency into a histogram so one can get an indication of the amount of I/O served from memory (i.e. those in the microsecond ranges) as well as how far out the outliers are. (Are the outliers on the VDBs matching up to the outliers seen on the ZFS appliance?)

The third section differentiates between the latency of single-block random reads (typically the 8K size) and the latency of multi-block sequential reads (32K and higher). The differentiation is important when comparing to Oracle stats, which are grouped into single block random reads (called “db file sequential read”) and sequential multi-block reads (called “db file scattered read”).

The final section, top files read and written [topfile], is sort of a sanity check: there are periods when there is supposed to be little to no NFS I/O and yet there is, so the top files section tells which file and which host the NFS I/O is coming from.

The last line after all the sections is total IOPs for reads plus writes. (Note these IOPs could get converted to higher values at the storage layer if using RAID5 which will cause each write to be two reads plus two writes.)

The first section shows up by default. The other sections require command line arguments.

To see just the first section, which is the default, run ioh.sh without any arguments:

sudo ./ioh.sh

To show non-default sections, add them to the command line

sudo ./ioh.sh -d "hist histsz histszw topfile"

A shortcut for all sections is “all”

sudo ./ioh.sh  -d all

Collecting in the background

nohup sudo ./ioh.sh -c 60 -t 86400 &

Runs the collection for 1 day (86400 seconds), collecting every 60 seconds, and puts raw output into the default file name “ioh.out”. The default output file name can be changed with the “-f filename” option.

1. Averages:

This displays I/O, ZFS and NFS data for both reads and writes. The data is grouped to try to make it easy to correlate these different layers. The first line is the date in epoch format.

columns

  • MB/s – MB transferred a second
  • avg_ms – average operation time
  • avg_sz_kb – average operation size in kb
  • count – number of operations

example

             |      MB/s|    avg_ms| avg_sz_kb|     count
 ------------|----------|----------|----------|--------------------
 R |      io:|  through | average  |  average |      count of operations
 R |     zfs:|  put     |latency   | I/O size | 
 R |     nfs:|          |millisec  | in KB    |   
 - 
 W |      io:|          |          |          |
 W | zfssync:|          |          |          |                                         
 W |     zfs:|          |          |          |                                         
 W |     nfs:|          |          |          |                                         
 W |nfssyncD:|          |          |          |                                         
 W |nfssyncF:|          |          |          |                                         

For writes

  • zfssync – these are synchronous writes. These should mainly be Oracle redo writes.
  • nfs – unstable storage writes
  • nfssyncD – data sync writes
  • nfssyncF – file sync writes

DTrace probes used

  • io:::start/done check for read or write
  • nfs:::op-read-start/op-read-done , nfs:::op-write-start/op-write-done
  • zfs_read:entry/return, zfs_write:entry/return

2. Histograms

latency distribution for i/o, zfs, nfs for reads and writes. These distributions are not normalized by time, i.e. if ioh.d outputs once a second then these counts will be equal to the counts in the first section. If ioh.d outputs every 10 seconds, then these values will be 10x higher.

3. Histograms by size for reads and writes

The first column is the average latency for the size of I/O on this line. The second column is the size. Each size bucket includes this size and every size lower, down to the previous bucket. The goal here is to show the sizes of I/Os and the different latency for different sizes. For an Oracle database with an 8k block size, 8k reads will tend to be random, whereas larger read sizes will be multi-block requests and represent sequential reads. It’s common to see the 8K reads running slower than the larger reads.

4. Top files

shows the top 5 files for reads and writes. First column is MB/s, then R or W, then IP, then port, then filename.

Examples and Usage

Idle system

The first thing to look at is the MB in and out, which answers

  • “how busy is the system?”
  • “is NFS throughput approaching the limits of the NIC?”

In the following example, there is less than 50KB/s total NFS throughput (in plus out), thus the system isn’t doing much, and there must be no database activity other than the regular maintenance processes which are always running on a database. To confirm this, one can look at the top files at the bottom and see that the only activity is on the control files, which are read and written to as part of database system maintenance. Otherwise there is no activity to speak of, so there is no reason to look at I/O latency in this case. Additionally, the majority of what little I/O there is comes in 16K sizes, which is typical of control file activity, whereas the default database data block activity is in 8K sizes. Most read I/O is coming from the ZFS appliance cache, as it’s 64 microseconds.

date: 1335282646 , 24/3/2012 15:50:46
TCP  out:  0.016 MB/s, in:  0.030 MB/s, retrans:        MB/s  ip discards:
----------------
            |       MB/s|    avg_ms| avg_sz_kb|     count
------------|-----------|----------|----------|--------------------
R |      io:|           |          |          |        .
R |     zfs:|     0.016 |     0.01 |    1.298 |       13
R |     nfs:|     0.016 |     0.10 |   16.000 |        1
- 
W |      io:|     0.365 |     4.59 |    9.590 |       39
W | zfssync:|     0.031 |    14.49 |   16.000 |        2
W |     zfs:|     0.001 |     0.07 |    0.199 |        3
W |     nfs:|           |          |          |        .
W |nfssyncD:|     0.003 |          |          |        .
W |nfssyncF:|     0.028 |    14.33 |   14.400 |        2
---- histograms  -------
    area r_w   32u   64u   .1m   .2m   .5m    1m    2m    4m    8m   16m   33m   65m    .1s   .3s   .5s    .5s+
R        io      .
R       zfs     60     5
R       nfs      .     .     5
-
W        io      .     .     .    20    43    60    11    11     8    28    17     1
W       zfs      2     8     5     2
W   zfssync      .     .     .     .     .     .     2     .     2     5     1     1
W       nfs      .     .     .     .     .     .     2     .     2     5     1     1
--- NFS latency by size ---------
    ms   size_kb
R   0.1   16     .     .     5
-
W          8     .     .     .     .     .     .     .     .     .     1     .     1
W  16.0   16     .     .     .     .     .     .     2     .     2     4     1
---- top files ----
   MB/s                  IP  PORT filename
W  0.00MB/s   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/oradata/sol/control01.ora
W  0.01MB/s  172.16.100.102 39938 /domain0/group0/vdb12/datafile/home/oracle/oradata/kyle/control01.ora
W  0.01MB/s  172.16.103.133 59394 /domain0/group0/vdb13/datafile/home/oracle/oradata/kyle/control01.ora
W  0.01MB/s   172.16.100.69 39682 /domain0/group0/vdb14/datafile/home/oracle/oradata/kyle/control01.ora
W  0.01MB/s  172.16.103.196 52482 /domain0/group0/vdb17/datafile/home/oracle/oradata/swingb/control01.ora
-
R  0.00MB/s   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/oradata/sol/control01.ora
R  0.01MB/s  172.16.100.102 39938 /domain0/group0/vdb12/datafile/home/oracle/oradata/kyle/control01.ora
R  0.01MB/s  172.16.103.196 52482 /domain0/group0/vdb17/datafile/home/oracle/oradata/swingb/control01.ora
IOPs          39

Active system

Below is an example of an active system. Looking at TCP bytes in and out, there is a fair bit: 3MB/s out and 2MB/s in. These rates are a long way from saturating 1GbE, but there is activity going on.

Reads: all reads are coming out of the cache. How do we know? For one, the average latency is 0.07 ms, or 70 microseconds. Does this 70us include slower reads that might be off disk? Looking at the histogram, one can see that the slowest zfs I/O is under 100us, and looking just above at the I/O histogram there are no I/Os, thus all the I/O is coming from cache.

Writes: writes are pretty slow. Oracle redo writes on good systems are typically 3ms or less for small redo. Unfortunately most of the I/O is coming from datafile writes, so it’s difficult to tell what the redo write times are. (Maybe worth enhancing ioh.d to show average latency by file.) Typically the redo does “nfssyncD” writes, and datafile writes are simply unstable storage writes (“nfs” writes) that get synced at a later date. This particular database is using the Oracle parameter “filesystemio_options=setall” which implements direct I/O. Direct I/O can work without sync writes, but the implementation depends on the OS. This OS implementation, OpenSolaris, causes all direct I/O writes to be sync writes.

date: 1335284387 , 24/3/2012 16:19:47
TCP out:  3.469 MB/s, in:  2.185 MB/s, retrans:        MB/s  ip discards:
----------------
            |       MB/s|    avg_ms| avg_sz_kb|     count
------------|-----------|----------|----------|--------------------
R |      io:|           |          |          |        .
R |     zfs:|     3.387 |     0.03 |    7.793 |      445
R |     nfs:|     3.384 |     0.07 |    8.022 |      432
- 
W |      io:|     4.821 |    12.08 |   24.198 |      204
W | zfssync:|     1.935 |    38.50 |   11.385 |      174
W |     zfs:|     0.001 |     0.06 |    0.199 |        3
W |     nfs:|           |          |          |        .
W |nfssyncD:|     1.906 |    39.06 |   11.416 |      171
W |nfssyncF:|     0.028 |    14.59 |   14.400 |        2
---- histograms  -------
    area r_w   64u   .1m   .2m   .5m    1m    2m    4m    8m   16m   33m   65m    .1s   .3s   .3s+
R        io      .
R       zfs   2185    34     5     .     1
R       nfs    903  1201    47     8     1
-
W        io      .     .    19   142   143    46    42   108   240   212    57    12     1
W       zfs     13     3     1
W   zfssync      .     .     .     .    10     6     .    21    60   384   287    86    16
W       nfs      .     .     .     .    10     5     .    21    60   384   287    86    16
--- NFS latency by size ---------
    ms   size_kb
R   0.1    8   900  1199    47     7     1
R   0.1   16     3     2     .     1
-
W  17.7    4     .     .     .     .     3     1     .     2     5     3     3
W  41.1    8     .     .     .     .     3     .     .    13    35   292   231    76    13
W  34.0   16     .     .     .     .     3     3     .     4    13    61    30     8     2
W  39.0   32     .     .     .     .     .     1     .     .     2    16    14     2     1
W  28.3   64     .     .     .     .     1     .     .     .     2     9     8
W  26.2  128     .     .     .     .     .     .     .     2     3     2     1
W        256     .     .     .     .     .     .     .     .     .     1
---- top files ----
   MB/s             IP  PORT filename
R  0.01   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/oradata/sol/control01.ora
R  0.01  172.16.103.196 52482 /domain0/group0/vdb17/datafile/home/oracle/oradata/swingb/control01.ora
R  0.02   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/oradata/sol/system.dbf
R  0.02   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/oradata/sol/undo.dbf
R  3.33   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/opt/app/product/dbs/soe.dbf 
-
W  0.01   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/oradata/sol/control01.ora
W  0.01  172.16.103.196 52482 /domain0/group0/vdb17/datafile/home/oracle/oradata/swingb/control01.ora
W  0.15   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/oradata/sol/undo.dbf
W  0.30   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/oradata/sol/redo1.log
W  1.46   172.16.100.81 21763 /domain0/group0/vdb16/datafile/export/home/opt/app/product/dbs/soe.dbf 
IOPs         204

ZFS read layer problem

          |      MB/s|    avg_ms|  avg_sz_kb
----------|----------|----------|-----------
R |   io :|    88.480|      4.60|     17.648 
R |  zfs :|    19.740|      8.51|     12.689 
R |  nfs :|    16.562|     22.67|     30.394 

In this case the ZFS I/O, 19MB/s, is higher than NFS at 16MB/s. Now that could be because something is accessing the file system locally on the ZFS appliance, or because ZFS is doing read-ahead, so there are possible explanations, but it’s interesting. Second, subsystem I/O at 88MB/s is much greater than ZFS I/O at 19MB/s. Again that is notable. It could be because there is a scrub going on (to check for a scrub, run “zpool status”; to turn off a scrub, run “zpool scrub -s domain0”, though the scrub has to be run at some point). Both are interesting observations.

Now the more interesting parts. The NFS response time, 22ms, is almost 3x the average ZFS response time, 8ms. On the other hand, the average size of an NFS I/O is 2.5x the average ZFS I/O size, so that might be understandable. The hard part to understand is that the ZFS latency, 8ms, is twice the latency of subsystem I/O at 4ms, yet the average size of the I/O subsystem reads is bigger than the average ZFS read. This doesn’t make any sense.

In this case, to home in on the data a bit, it would be worth turning off the scrub if one was running and seeing what the stats are, to eliminate a factor that could be muddying the waters.

But in this case, even without a scrub going, the ZFS latency was 2-3x slower than the I/O subsystem latency.

It turns out ZFS wasn’t caching and was spending a lot of time trying to keep the ARC clean.

ZFS write layer problem

           |       MB/s|    avg_ms| avg_sz_kb|     count
-----------|-----------|----------|----------|--------------------
W |     io:|     10.921|     23.26|    32.562|       380
W |    zfs:|    127.001|     37.95|     0.199|      8141
W |    nfs:|     0.000 |     0.00 |    0.000 |        0

NFS is 0 MB/s because this load came over HTTP. The current version of ioh would show the TCP MB/s. This version also mixed zfs sync and non-sync writes into one bucket, but much of the ZFS writes have to be non-sync, because the write rate is 127MB/s whereas the I/O subsystem writes are only 10MB/s; thus at least 117MB/s is not sync, and if writes are not sync they are just memory writes, so they should be blindingly fast, but they aren’t. The average latency for the ZFS writes is 37ms. All the more shocking, the average size is only 0.199K, whereas the I/O subsystem writes 32K in 23ms. The case here was that, because of disk errors, the ZFS layer was self-throttling way too much. This was a bug.


IP CIDR rules and address ranges

November 3rd, 2017

I always forget the IP CIDR address range coverage rules and where to look them up.

It’s the wiki!

https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing

and for good reference here is the table:

/32 is for a single address

/24 is for a range in the last octet: x.x.x.0 through x.x.x.255

/16 is for a range in the last 2 octets: x.x.0.0 through x.x.255.255
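For example, 203.0.113.0/24 covers the 256 addresses 203.0.113.0 through 203.0.113.255; in general a /n prefix covers 2^(32-n) addresses, so a /16 covers 65,536.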

(screenshot of the CIDR table from the Wikipedia article)


Best method for tuning sub-optimal execution plans

October 18th, 2017

How do you determine if the Oracle SQL optimizer has created a sub-optimal execution plan? Re-run statistics and see what happens? Wait for Oracle to find other execution plans? What if neither method is helping? Do you read the execution plan? What do you look at? Differences in actual vs estimated? How successful is that? Look for full table scans? Do you look at the 10053 trace? How much time and effort does that take? What do you look at in the 10053 trace? Do you have a systematic methodology that works in almost all cases?

Well there is a method that is reliable and systematic. It’s laid out in Dan Tow’s book SQL Tuning.

The method is tedious, as it requires a lot of manual work to draw join trees, identify constraints and relationships, and manually decompose and execute every 2-table join in the statement. It can add up to a lot of work, but it is systematic and dependable.

The cool thing is it can all be done automatically with a tool called DB Optimizer that now (as I look today) only costs about $400.

If you cost your company $50/hour, then in 8 hours of saved work it has paid for itself. In my experience as a DBA I have several SQL statements a year that take me over a day to optimize manually but that I can get done in a few minutes with DB Optimizer. Thus with just one hard SQL statement the tool has paid for itself. The DB Optimizer analysis might run for a couple hours, but afterwards, with the data it collects and presents, I can find a better tuning path in minutes if one exists.

 

Here is a previous blog post that gives an overview.

Here is a video that explains the method. (Same presentation in a different video.)

Slides from the videos.

Here is Dan Tow’s book SQL Tuning that originally laid out the method.

Here is post by Jonathan Lewis demonstrating the method.

Pick up a copy. I think it’s super cool and I’m interested in feedback on your experiences.

 



Oaktable World 2017 @ Oracle Open World

September 29th, 2017

 

http://www.oaktable.net/blog/oak-table-world-2017-oracle-open-world

The Oak Table members will be discussing their latest technical obsessions and research on Monday and Tuesday (Oct. 2nd and 3rd, 2017). The truth is, folks: the Oak Table experts are an AWESOME group (if I don’t say so myself! :)), as we could have easily done another day of incredible sessions, but alas, two days is all we have available for this year’s event.

 



benchmarking your disks

March 17th, 2017

 

While at Delphix, we did a lot of storage benchmarking. The I/O response times of Delphix depend, as one would logically imagine, heavily on the underlying disks. Sure, Delphix can cache a lot (with 1 TB of RAM and 3x compression that’s 3TB, and that 3TB can be shared by 10 or 100 copies, the equivalent of 30TB or 300TB of databases), but really there will always be important I/O coming from the storage subsystem.

Now Delphix mainly runs database loads, so the best test for storage that is hooked up to Delphix is to benchmark the storage I/O for a database workload. Two questions arise:

  • What tool can benchmark the I/O?
  • What is a typical database I/O workload?

For the tool, the clear answer seems to be fio, which is not only quite flexible but has an active community and regular updates; it was started by the renowned Linux I/O master Jens Axboe and is still actively maintained by him.

Now for database workloads there are 4 types of I/O

  1. Random single block reads (typically lookups by index)
  2. Large multi block reads (typically full table scans)
  3. Small sequential I/Os to transaction log files
  4. Large sequential I/Os to transaction log files (generally when commits are infrequent and change rate is high)

Now in a database all of these can be done by multiple processes concurrently, and thus we need to benchmark for concurrency as well.

Thus the matrix is the 4 types of I/Os to be tested by different number of concurrent users.
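As a sketch, a single cell of that matrix (for example, 8 concurrent random 8k readers) can be exercised with a raw fio invocation like the following (paths and values are illustrative, and --ioengine=libaio is Linux-specific):

fio --name=randread8k --filename=/domain0/fiotest/fiodata \
    --rw=randread --bs=8k --size=1g --direct=1 \
    --ioengine=libaio --iodepth=1 --numjobs=8 \
    --time_based --runtime=60 --group_reporting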

Now fio doesn’t come with such prepackaged I/O benchmarks, so I created a script fio.sh to run configurable database I/O benchmarks.

The code and examples are on github at https://github.com/khailey/fio_scripts

Prerequisite is to have a binary of the fio command ready. Download the source at https://github.com/axboe/fio and compile it.

Here is an example of running the script (-b followed by the full path to the fio binary, -w followed by the directory in which to create a temporary large file for I/O testing):

  $  fio.sh -b `pwd`/fio.opensolaris -w /domain0/fiotest   

configuration: 
    binary=/home/oracle/fiodir/fio.opensolaris
    work directory=/domain0/fiotest   
    output directory=/home/oracle/fiodir
    tests=readrand read write
    direct=1
    seconds=60
    megabytes=65536
    custom users=-1
    custom blocksize=-1
    recordsize =8k
    filename (blank if multiple files)="filename=fiodata"
    size per file of multiple files=""
proceed?
y
CREATE 65536 MB file /home/oracle/fiodir/workdir/fiodata

creating 10 MB  seed file of random data

20480+0 records in
20480+0 records out
10485760 bytes (10 MB) copied, 1.17997 seconds, 8.9 MB/s

creating 65536 MB of random data on 

............................................................. 64926 MB remaining  300 MB/s 216 seconds left
............................................................. 64316 MB remaining  600 MB/s 107 seconds left
............................................................. 63706 MB remaining  300 MB/s 212 seconds left
............................................................. 63096 MB remaining  300 MB/s 210 seconds left
............................................................. 62486 MB remaining  200 MB/s 312 seconds left
............................................................. 61876 MB remaining  100 MB/s 618 seconds left
............................................................. 61266 MB remaining  35 MB/s 1750 seconds left
............................................................. 60656 MB remaining  300 MB/s 202 seconds left
............................................................. 60046 MB remaining  150 MB/s 400 seconds left
............................................................. 59436 MB remaining  75 MB/s 792 seconds left
............................................................. 58826 MB remaining  75 MB/s 784 seconds left
............................................................. 58216 MB remaining  85 MB/s 684 seconds left
............................................................. 57606 MB remaining  75 MB/s 768 seconds left
............................................................. 56996 MB remaining  75 MB/s 759 seconds left
............................................................. 56386 MB remaining  85 MB/s 663 seconds left

(more output)

test  users size         MB       ms  IOPS    50us   1ms   4ms  10ms  20ms  50ms   .1s    1s    2s   2s+
    read  1   8K r   28.299    0.271  3622           99     0     0     0
    read  1  32K r   56.731    0.546  1815           97     1     1     0     0           0
    read  1 128K r   78.634    1.585   629           26    68     3     1     0           0
    read  1   1M r   91.763   10.890    91                 14    61    14     8     0     0
    read  8   1M r   50.784  156.160    50                              3    25    31    38     2
    read 16   1M r   52.895  296.290    52                              2    24    23    38    11
    read 32   1M r   55.120  551.610    55                              0    13    20    34    30
    read 64   1M r   58.072 1051.970    58                                    3     6    23    66     0
randread  1   8K r    0.176   44.370    22      0     1     5     2    15    42    20    10
randread  8   8K r    2.763   22.558   353            0     2    27    30    30     6     1
randread 16   8K r    3.284   37.708   420            0     2    23    28    27    11     6
randread 32   8K r    3.393   73.070   434                  1    20    24    25    12    15
randread 64   8K r    3.734  131.950   478                  1    17    16    18    11    33
   write  1   1K w    2.588    0.373  2650           98     1     0     0     0
   write  1   8K w   26.713    0.289  3419           99     0     0     0     0
   write  1 128K w   11.952   10.451    95           52    12    16     7    10     0     0           0
   write  4   1K w    6.684    0.581  6844           90     9     0     0     0     0
   write  4   8K w   15.513    2.003  1985           68    18    10     1     0     0     0
   write  4 128K w   34.005   14.647   272            0    34    13    25    22     3     0
   write 16   1K w    7.939    1.711  8130           45    52     0     0     0     0     0     0
   write 16   8K w   10.235   12.177  1310            5    42    27    15     5     2     0     0
   write 16 128K w   13.212  150.080   105                  0     0     3    10    55    26     0     2

What we see is

  • test – the test we are running either randread, write or read
  • users – number of concurrent users
  • size – size of I/O requests. Databases typically request 8kb at a time
  • MB – MB per second throughput
  • ms – average latency
  • min – min latency(not shown here)
  • max – max latency (not shown here)
  • std – standard deviation on latency (not shown here)
  • IOPS – I/O operations per second
  • 50us 1ms 4ms 10ms 20ms 50ms .1s 1s 2s 2s+ – histogram of number of I/Os faster than heading value

 

This can be useful to just run on your laptop.

This summer I bought a used Mac laptop that had something called a Hybrid SSD. I had been using a Mac with preinstalled SSD disks and thought the Hybrid would be similar response-wise, but once I started using it there was something clearly wrong; before sending it back I wanted some empirical proof, so I ran fio.sh.

Here is the comparison

SSD - came with the Mac

test  	users size      MB       ms      min      max      std    IOPS 
randread    1   8K  32.684    0.234    0.002    9.393    0.144   4183,
randread    8   8K 240.703    0.257    0.001    2.516    0.137  30810,
randread   16   8K 372.503    0.333    0.001    1.994    0.185  47680,
randread   32   8K 478.863    0.520    0.001    5.281    0.294  61294,
randread   64   8K 476.948    1.045    0.001   11.564    0.582  61049,

SSHD - hybrid SSD installed after market

test  	users size      MB       ms      min      max      std    IOPS 
randread    1   8K   0.533   14.608    0.005  138.783    8.989     68,
randread    8   8K   0.767   80.769    0.035  256.965   53.891     98,
randread   16   8K   0.801  152.982    0.012  331.538   63.256    102,
randread   32   8K   0.810  298.122    0.015  519.073   79.781    103,
randread   64   8K   0.796  590.696    0.030  808.146  143.490    101,

(full list of SSD vs HSSD on my Macs at https://github.com/khailey/fio_scripts/blob/master/macssd)

The hybrid is atrocious compared to the SSD.

The random read is 14.6 ms which is the speed of a slow HDD.
A 7K RPM HDD should respond at under 10ms.
A 15K RPM HDD should respond at around 6ms.
An SSD on a 2 year old Mac responds at 0.23 ms.

It’s nice to just have an easy-to-run script to test out storage.
Here is my linux box

test  users size         MB       ms      min      max      std    IOPS 
                                            
randread  1   8K r   14.417    0.517    0.005    8.922    0.382    1845
randread  8   8K r   26.497    2.355    0.004   12.668    0.790    3391
randread 16   8K r   24.631    5.069    0.004   15.168    1.080    3152
randread 32   8K r   24.726   10.101    0.005   32.042    2.124    3164
randread 64   8K r   24.899   20.051    0.005   37.782    4.171    3187

On my Linux desktop you can see how the MB/sec throughput maxes out at about 26 MB/sec, and after that latency just goes up proportionally as we add more concurrency.

The github repository also has R scripts to visualize the data (see the readme in github for details on how to generate the graphics).

 

(graph: skytap1 randread bs=8K)

Here is an explanation of the graphics.

(graph key)

 

There are a number of factors that are important when benchmarking I/O, such as whether direct I/O is used or not, the size of the cache on the host running fio, the size of the back end storage cache, the size of the file used to test I/O, whether that file is initialized with 0’s, patterned data or random data, whether the file system compresses or not, etc. Check out this blog post for some anomalies and surprises: http://datavirtualizer.com/lies-damned-lies-and-io-statistics/


Postgres monitoring script pgmon.sh

March 9th, 2017

(photo by Thomas Hawk)

https://github.com/khailey/pgmon

$ ./pgmon.sh 
Usage: pgmon.sh [username] [password] [host] <sid=postgres> <port=5432> <runtime=3600>

$ ./pgmon.sh kyle kyle  mymachine.com

  psql -t -h mymachine.com -p 5432 -U kyle postgres < /tmp/MONITOR/tmp/mymachine.com:postgres_collect.pipe &

  RUN_TIME=-1
  COLLECT_LIST=
  FAST_SAMPLE=wts
  TARGET=mymachine.com:postgres
  DEBUG=0

  Connected, starting collect at Wed Mar 8 12:05:12 PST 2017
  starting stats collecting
  SET

  AAS| blks_hit | blks_read | tup_returned | tup_fetched | tup_inserted | tup_updated | tup_deleted  

   1 |    38281 |         1 |       628629 |       23600 |         1068 |        2316 |           0 
  16 |   146522 |         0 |      1497604 |       48647 |         2352 |        4599 |        2462 
   2 |   114046 |         0 |      1894329 |       46341 |         3822 |        3852 |        3066 
   2 |   146728 |         0 |      2239014 |       61420 |         3822 |        5668 |        3150 
  16 |    70446 |         0 |       945021 |       49284 |         2016 |         686 |         757 
  13 |   264149 |         0 |      1146816 |       53816 |         1638 |        2176 |        1852 
  15 |    54324 |         0 |       226542 |       19078 |          840 |         396 |          31 
  13 |  1170087 |         0 |      2301442 |      186967 |         2058 |        4276 |        1340 
   3 |  1036439 |         0 |      3411396 |       57392 |         4158 |        5041 |        3605 
   1 |   135927 |         0 |      1931473 |       90238 |         4788 |        5077 |        3654 
   5 |    92975 |         0 |      1427641 |       49175 |         2772 |        2812 |        1764 
  16 |    73695 |         0 |      1001290 |       35585 |         1806 |        1721 |        1915 
  14 |    65117 |         0 |       242383 |       22150 |          420 |         530 |         511 
   4 |   111906 |         0 |      1593090 |       49570 |         2982 |        4718 |        3086 

When there are databases for which I’m responsible, I like to have powerful database performance analysis tools. In the Oracle world I use tools such as Lab128, DB Optimizer and OEM. There are a number of other tools that look good from companies such as Quest, Solarwinds and Orachrome. Often, though, when I’m dealing with a new situation, a new database or a database that’s temporary, I just want a quick and easy way to see basic stats. For Oracle I have a number of easy monitoring scripts.

For monitoring databases other than Oracle, I have less experience. I did do work on SQL Server, Sybase and DB2 when working on the development of DB Optimizer, but monitoring popular open source databases such as MySQL and Postgres is relatively new to me.

Right now I’m looking into Postgres.

For Postgres I wanted some quick and easy way to monitor various stats in realtime, to start iterating and prototyping the kind of data I wanted to look at, so I put together a shell script. The advantage of a shell script is that it should work most anywhere, and the only requirement is psql, which should be readily available in a PostgreSQL environment.

I took a script from my Oracle experiences, oramon.sh, and retrofitted it for Postgres. It is a script that sets up a named pipe, then has psql (or for Oracle, SQL*Plus) read from that named pipe. The script procedurally feeds SQL statements into the named pipe. The output from psql goes to files, and the script manages the output to the screen. The script is a little less optimal for PostgreSQL than Oracle, as I had to use a temporary table to store variable values. psql does have variables, but they look to be static and not modifiable (they seem to act like defines).
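A minimal sketch of the named-pipe technique (illustrative only, not the actual pgmon.sh code; host, user and query are placeholders):

#!/bin/bash
# create a named pipe and have psql execute whatever SQL arrives on it
PIPE=/tmp/collect.pipe
mkfifo $PIPE
psql -t -h myhost -U kyle postgres < $PIPE > /tmp/collect.out &
exec 3> $PIPE      # hold the pipe open for writing
while true; do
  # count of actively running sessions, a rough AAS sample
  echo "select count(*) from pg_stat_activity where state = 'active';" >&3
  sleep 5
done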

The script pulls out some basic stats and outputs them every 5 seconds on the command line, sort of like vmstat. The first stat is AAS or average active sessions which is the count of sessions actively running queries, followed by a few basic database stats.

 


Example Jmeter workload for Postgres and Oracle

February 27th, 2017

pgload.jmx is a JMX file you can load into JMeter and run a substantial load on a Postgres database. It should work just as well on Oracle if you change the test SQL from “Select 1” to “select 1 from dual”.

Install JMeter on your machine. On my Mac, I did

  • brew install jmeter

You will need the Postgres JDBC driver on JMeter’s classpath.

To use this file, save it as pgload.jmx and then open it up with JMeter. Change the host, port and database name in the database URL, and fill in your username and password.

The database URL looks like jdbc:postgresql://machine.com:5432/database, where “machine.com” is the machine running PostgreSQL, “database” is the name of the database you are using in that instance, and “5432” is the port to connect over.

(screenshot: the JMeter “Setup Connection” configuration)

 

You have to make TWO connection string changes. One is as shown above for “Setup Connection” and the other is 2 rows below, in “JDBC Connection Configuration”, where you make the same changes.

Then just hit the green triangle to start the load.

The load will create a table named authors, a sequence called serial and an index called author_id if these don’t already exist.

It will then run inserts, deletes, updates and selects on this table.

 


9th Circuit Court Ruling 3-0

February 12th, 2017

Little did I know that this building, which captured my visual attention and imagination so many times walking to work over the last 6 months, would play a historic role in the current political climate.

Here is a picture of the US District Court House from recent articles

(photo of the US District Court House)

 

And here are some of my iPhone shots over the last few months with some Instagram filtering mixed in :)

(iPhone photos of the courthouse)
