Introducing network latency on Docker (SQL Server Always On testing)

April 13th, 2022

I am testing out Always On replication on SQL Server with a cool Docker setup that includes a primary and a secondary: https://github.com/rafaelrodrigues3092/docker-mssql-alwayson

When I created data on the primary, there was never any lag on the secondary. To make sure the lag values I was reading were correct, I wanted to introduce network latency between the primary and the secondary and so create some lag. For this there is a neat tool called “tc” that can introduce network delay. For example, on either or both of the Docker containers, I can set up lag with:

# add lag
tc qdisc add dev eth0 root netem delay 100ms
# delete lag
tc qdisc del dev eth0 root netem delay 100ms
# show lag
tc qdisc
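
The three commands above can be wrapped in one small helper so the device and delay are easy to vary. This is only a sketch: the function name `netem` and the `DRY_RUN` switch are made up for this example, and it deletes the root qdisc without repeating the netem parameters, which tc also accepts.

```shell
# Hypothetical helper around the tc commands; names are illustrative only.
# With DRY_RUN=1 it prints the command instead of running it.
netem() {
    action="$1"
    dev="${2:-eth0}"
    delay="${3:-100ms}"
    case "$action" in
        add)  cmd="tc qdisc add dev $dev root netem delay $delay" ;;
        del)  cmd="tc qdisc del dev $dev root" ;;
        show) cmd="tc qdisc show dev $dev" ;;
        *)    echo "usage: netem add|del|show [device] [delay]" >&2
              return 2 ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$cmd"
    else
        $cmd   # needs tc installed and the NET_ADMIN capability
    fi
}

# Example:
#   netem add eth0 100ms
#   netem show
#   netem del
```

Running it first with DRY_RUN=1 is a cheap way to check the exact command before touching the interface.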

The problem is that there was no “tc” on my Docker containers, and yum couldn’t find the iproute2 package that provides it:

# yum install iproute2
Last metadata expiration check: 1:34:54 ago on Wed Apr 13 20:10:03 2022.
No match for argument: iproute2
Error: Unable to find a match: iproute2

I figured there was something wrong with the way the yum repos are set up on the Docker images. I poked around there for a while but couldn’t get it to work, so I tried to install by hand, which turned out to be more onerous than I thought. Here I’m just documenting what was needed:

mkdir tc_build
cd tc_build
curl -o iproute2.tz  https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/snapshot/iproute2-5.17.0.tar.gz
curl -o bison.gz http://ftp.gnu.org/gnu/bison/bison-3.8.tar.gz
curl -o gettext.gz  https://ftp.gnu.org/pub/gnu/gettext/gettext-0.21.tar.gz
curl -o texinfo-6.8.tar.gz https://ftp.gnu.org/gnu/texinfo/texinfo-6.8.tar.gz
wget https://github.com/westes/flex/releases/download/v2.6.4/flex-2.6.4.tar.gz
yum install make
yum install m4
yum install libtool
yum install diffutils
for i in *z; do echo $i ; tar xvf $i ; done
mkdir trash
mv *z trash
cd texinfo*
  ./autogen.sh
  ./configure
  make
  make install
cd ../gettext*
  make 
cd ../flex*
  make
cd ../bison*
  make  
cd ../iproute2*
  make  

The above isn’t exact but should get you mostly there. When it comes to “make” in each directory, it’s some variation of:

  • autogen.sh
  • configure
  • make
  • make install
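
The four steps above can be sketched as a helper that picks the steps from the scripts a source directory ships. This is not the exact sequence I ran. The function names `build_steps` and `build_pkg` are made up for this example, and some packages may need extra options or skip a step.

```shell
# Hypothetical helper: print the build steps that apply to one source
# directory, based on which scripts it ships.
build_steps() {
    dir="$1"
    # autogen.sh is only needed when no configure script exists yet
    if [ ! -f "$dir/configure" ] && [ -f "$dir/autogen.sh" ]; then
        echo "./autogen.sh"
    fi
    # configure exists already, or autogen.sh is expected to generate it
    if [ -f "$dir/configure" ] || [ -f "$dir/autogen.sh" ]; then
        echo "./configure"
    fi
    echo "make"
    echo "make install"
}

# Run the steps in order inside the directory, stopping at the first failure.
build_pkg() {
    steps=$(build_steps "$1") || return 1
    (
        cd "$1" || exit 1
        echo "$steps" | while read -r step; do
            echo "==> $step"
            $step </dev/null || exit 1
        done
    )
}

# Example, in the same order as above:
#   for d in texinfo* gettext* flex* bison* iproute2*; do build_pkg "$d" || break; done
```

Calling `build_steps` on its own prints the plan without building anything, which is handy for checking a directory first.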

Not every directory needed all 4 commands, but unfortunately I didn’t take detailed enough notes to say which needed which. The most onerous part was “gettext”, which was needed just for “autopoint”. I didn’t find a way to install “autopoint” on its own, and “gettext” is big and takes a while to compile and install. Once it was all done, “tc” was there. Then I just tar’ed the “tc” commands in /usr/sbin into tc.tar.gz, which I can copy to a Docker container, and voila, it’s there.
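
The packaging step might look like the sketch below. It assumes that only the tc binary itself is needed and that the target container has tar. The function name `pack_tc` and the container name `db1` are made up for this example.

```shell
# Hypothetical helper: archive the tc binary found under a root directory.
# Paths inside the archive stay relative (usr/sbin/tc) so it can be unpacked at /.
pack_tc() {
    root="${1:-/}"
    out="${2:-tc.tar.gz}"
    tar -czf "$out" -C "$root" usr/sbin/tc
}

# Example (container name "db1" is an assumption):
#   pack_tc / tc.tar.gz
#   docker cp tc.tar.gz db1:/tmp/tc.tar.gz
#   docker exec db1 tar -xzf /tmp/tc.tar.gz -C /
```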

 

I had to modify docker-compose.yml to give the containers the NET_ADMIN capability, which tc needs in order to change qdiscs:

        cap_add:
            - NET_ADMIN

Then, after starting the containers, I set the network latency on both the primary and the secondary:

tc qdisc add dev eth0 root netem delay 500ms

The Docker primary comes with a SALES database that is replicated to the secondary, so I go on the primary, use SALES, and create a bunch of data:

 

use sales;
create table foo(id int, content varchar(2000)) ;   
WITH mycte AS ( 
      SELECT 1 DataValue 
      UNION all 
      SELECT DataValue + 1 
      FROM    mycte    
      WHERE   DataValue + 1 <= 100000
 ) 
 INSERT INTO foo(id,content) 
 SELECT   DataValue,REPLICATE (NEWID() ,55) -- 
 FROM mycte m  
 OPTION (MAXRECURSION 0) 
 ; 

 

There is a nice SQL query for monitoring Always On here: https://dba.stackexchange.com/questions/278324/log-send-queue-size-and-redo-queue-size-in-ag

Here’s a short version of the query

SELECT 
    --AG info
    AGName                          = ag.name, 
    ReplicaServer                   = ar.replica_server_name, 
    DbName                          = adc.database_name, 
    Failover_ready                  = ar.failover_mode_desc,
    -- State Info
    IsLocal                         = drs.is_local, 
    IsPrimary                       = drs.is_primary_replica, 
    SynchronizationState            = drs.synchronization_state_desc, 
    SynchronizationHealthState      = drs.synchronization_health_desc, 
    -- Queue Size & rates
    SendQueueSize                   = drs.log_send_queue_size,
    RedoQueueSize                   = drs.redo_queue_size, 
    SendRateKb                      = drs.log_send_rate, 
    RedoRateKb                      = drs.redo_rate, 
    -- Oh yeah, filestream, too
    FileStreamSendRate              = drs.filestream_send_rate,
    drs.Secondary_lag_seconds
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_databases_cluster AS adc 
             ON drs.group_id = adc.group_id AND 
                drs.group_database_id = adc.group_database_id
JOIN sys.availability_groups AS ag
             ON ag.group_id = drs.group_id
JOIN sys.availability_replicas AS ar 
             ON drs.group_id = ar.group_id AND 
                 drs.replica_id = ar.replica_id
ORDER BY 
    -- ag.name, ar.replica_server_name, adc.database_name;
    drs.log_send_queue_size + drs.redo_queue_size DESC;

Output on the primary:

-[ RECORD 1 ]-------------------------
AGName                     | AG1
ReplicaServer              | db2
DbName                     | SALES
Failover_ready             | MANUAL
IsLocal                    | 0
IsPrimary                  | 0
SynchronizationState       | SYNCHRONIZED
SynchronizationHealthState | HEALTHY
SendQueueSize              | 6904
RedoQueueSize              | 9544
SendRateKb                 | 5113
RedoRateKb                 | 41343
FileStreamSendRate         | 0
Secondary_lag_seconds      | 0
-[ RECORD 2 ]-------------------------
AGName                     | AG1
ReplicaServer              | db1
DbName                     | SALES
Failover_ready             | MANUAL
IsLocal                    | 1
IsPrimary                  | 1
SynchronizationState       | SYNCHRONIZED
SynchronizationHealthState | HEALTHY
SendQueueSize              | NULL
RedoQueueSize              | NULL
SendRateKb                 | NULL
RedoRateKb                 | NULL
FileStreamSendRate         | NULL
Secondary_lag_seconds      | NULL

Output on the secondary:

-[ RECORD 1 ]-------------------------
AGName                     | AG1
ReplicaServer              | db2
DbName                     | SALES
Failover_ready             | MANUAL
IsLocal                    | 1
IsPrimary                  | 0
SynchronizationState       | SYNCHRONIZED
SynchronizationHealthState | HEALTHY
SendQueueSize              | 60
RedoQueueSize              | 29068
SendRateKb                 | 4421
RedoRateKb                 | 41462
FileStreamSendRate         | 0
Secondary_lag_seconds      | NULL
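
One way to read the queue numbers above is to divide each queue by its rate to get a rough drain time. The sketch below assumes the units given in Microsoft's documentation for sys.dm_hadr_database_replica_states (queue sizes in KB, rates in KB/sec). The function name `drain_seconds` is made up for this example, and the result is only a rough estimate because both values keep changing while data is flowing.

```shell
# Rough drain-time estimate: queue size (KB) / rate (KB/sec).
drain_seconds() {
    awk -v q="$1" -v r="$2" 'BEGIN {
        if (r > 0) printf "%.2f\n", q / r
        else print "n/a"
    }'
}

# Values taken from the primary's view of db2 above:
#   drain_seconds 6904 5113     # send queue
#   drain_seconds 9544 41343    # redo queue
```

With the primary's numbers this gives about 1.35 seconds for the send queue and about 0.23 seconds for the redo queue.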

Another thing you can do is read the error log for Always On info:

EXEC xp_ReadErrorLog 0, 1, N'Always';
-[ RECORD 1 ]-------------------------
LogDate     | 2022-04-13 18:47:56.440
ProcessInfo | spid27s
Text        | Always On: The availability replica manager is starting. This is an informational message only. No user action is required.
-[ RECORD 2 ]-------------------------
LogDate     | 2022-04-13 18:47:56.590
ProcessInfo | spid27s
Text        | Always On: The availability replica manager is waiting for the instance of SQL Server to allow client connections. This is an informational message only. No user action is required.
-[ RECORD 3 ]-------------------------
LogDate     | 2022-04-13 18:48:18.900
ProcessInfo | spid53
Text        | Always On: The local replica of availability group 'AG1' is preparing to transition to the primary role. This is an informational message only. No user action is required.
-[ RECORD 4 ...


  1. Comments

  2. khailey
    | #1

    to do:
    check out
    https://github.com/shemminger/iproute2
    https://mirrors.edge.kernel.org/pub/linux/utils/net/iproute2/
    another iproute2 repository to see if it installs more easily
    Thanks to Jarod Still for the suggestion

