Archive

Archive for December, 2015

Test Data Management (TDM) – your downfall or triumph?

December 29th, 2015

A recent post on LinkedIn had this image:

dev_test 

The problem is as old as development: the tester finds a bug, but the developer can’t reproduce it and closes it as unreproducible.

… but how else can testing show development the bug? Should the developer come over to the tester’s desk and work on the tester’s system while the tester waits?

The problem is that there is a difference between the developer’s environment and the tester’s environment. To solve this problem we have to track down why, and then see if we can address it. The industry has been addressing many of these whys. We can share exact machine copies with virtual machines, where a whole machine can be exported as an OVA file (open virtualization archive). We have made sharing applications and builds even faster and more efficient with things like Docker. We’ve made build configurations repeatable and automated with tools like Vagrant, Chef, Puppet, Ansible, etc.

Screen Shot 2015-12-29 at 3.14.54 PM

Data is the problem

What we haven’t done is make sure the data sets in the developer and QA environments are the same. Ensuring the same data has historically been deemed too costly, too slow and in some cases simply not possible. It seems impossible to guarantee that data sets are the same, especially when they are multiple terabytes in size, and especially with DevOps and continuous delivery, where we might want to refresh the test data set after each destructive test, and those tests might need to run multiple times a day.

Anti-Pattern

Instead of sharing full, exact copies of production data in development and testing, companies use subsets or small copies of the data that are easier to provision and are supposed to be representative of production data. The number of bugs found at each stage (development, testing, UAT and production) varies, and it generally looks like this (these numbers are taken from an actual customer, a large financial institution based out of New York):
Screen Shot 2016-01-06 at 4.23.46 PM
Screen Shot 2015-12-29 at 1.35.43 PM

The problem is that each of these stages (dev, test, UAT, prod) lasts a different amount of time. Development and testing tend to last several months, and user acceptance testing on a full-size copy of production is done near the end of the development cycle:

Screen Shot 2015-12-29 at 2.08.50 PM

UAT on a full copy of production usually happens near the end of the development cycle. If we look at the number of bugs found, we see that most are found during UAT, yet UAT is often scheduled for only a couple of weeks before release, and so many bugs are found that it forces either extending the release date or releasing code with bugs.

Another problem is that the longer it takes to address a bug, the more expensive it is to fix, and the expense goes up exponentially. This was pointed out by Barry Boehm in his book Software Engineering Economics back in 1981:

Screen Shot 2015-12-29 at 2.35.30 PM

 

Here is a chart from a recent Dynatrace paper showing the exponentially increasing cost of fixing bugs:

unnamed

 

Chart by Dynatrace from their paper “Goodbye war room, hello DevOps 2.0”

It makes sense that the longer one waits to fix a bug, the more expensive it is. If you run code immediately after writing it and it returns an error, you know what you were writing, why you were writing it and which lines you just added, so it’s relatively easy to identify, understand and fix the bug. On the other hand, if a tester reports the problem two months later, you have to remember why you were writing that code, what it did, how it worked and how it could have a bug. It takes longer and it’s harder. And of course we all know how expensive a bug can be if it hits production: not just the developer but a whole team of people has to get involved to track it down, and worse yet there may be corruption or loss of business data.

Solution

The solution is Data as a Service (DaaS), where data is treated like code and can be branched, bookmarked, shared and versioned. DaaS is built on virtual data: data that shares duplicate blocks with other copies. The shared blocks are never modified; modified data is written to a new location, avoiding overwriting the existing data, and the new data is only visible to the virtual copy the change was made on. With DaaS, a tester who finds a bug can hand the developer not only the exact code version but also a bookmark of the exact data state needed to reproduce the bug.
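The mechanics are the same copy-on-write idea used by filesystem snapshots and clones. As an illustration only (plain ZFS here rather than the Delphix interface, and the pool and dataset names are made up):

# take a snapshot of the current state of the data -- no blocks are copied
zfs snapshot tank/proddata@bug-1234

# give a developer a writable clone of that snapshot; it shares all unmodified
# blocks with the snapshot and only stores the blocks the developer changes
zfs clone tank/proddata@bug-1234 tank/dev1-bug-1234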

Applications can involve several layers, such as the application stack, the database, the file system, configuration files, etc. Complex applications can involve an application stack on top of more than one data source. Getting a copy of all of these at the same point in time and then sharing that data can be almost impossible unless you have DaaS. With DaaS, one can track multiple different layers and bookmark them all at the same point in time. For example, a developer or tester can be given a “container” that holds the application stack, the data files and the database binary distribution. A tester could bookmark the container and pass that bookmark to a developer, giving the developer not only the correct version of the code but also the correct version of the database and data to reproduce the bug.

With DaaS we’ve seen bugs discovered earlier in development and the number of late-stage bugs decrease.

Screen Shot 2016-01-06 at 4.23.46 PM
Screen Shot 2015-12-29 at 2.57.12 PM

Finding bugs early not only keeps them out of production but also makes them cheaper to fix.

 

Screen Shot 2015-12-29 at 3.01.03 PM

 

By investing in DaaS  we can shift left, reducing costs, shortening timelines, improving quality and pulling in more revenue earlier.

Screen Shot 2015-12-29 at 3.05.58 PM

You may say, “that’s great and fine for you, but I don’t have DaaS technology at work, and even if I did, it sounds complicated.”

Well, today is your day. Delphix supplies DaaS simply, with point and click of a mouse, a CLI, or RESTful web services, and there is even a free version called “Delphix Express” that comes with a full demo lab you can download and play with on your laptop, whether Mac, Windows or Linux:

http://datavirtualizer.com/delphix-express-installation/

Screen Shot 2015-12-29 at 3.15.28 PM

Uncategorized

ASH art on Twitter profiles

December 28th, 2015

What fun to see Top Activity aka ASH artwork on twitter profiles:

Screen Shot 2015-12-28 at 1.33.45 PM

Screen Shot 2015-12-28 at 1.32.59 PM

Uncategorized

WordPress and Docker

December 3rd, 2015

One of the attractive things about Docker, to me, is that I can spin up containers with software that doesn’t affect the rest of my system. I can spin a container down, remove it, or make multiple containers. It’s all nice, neat and clean, unlike traditional application installation on an OS, where files are spewed all through the file system and it’s a mess to try to clean up or remove if need be. Sort of like the difference between brain surgery and taking off or putting on a piece of clothing.

In the last couple of posts I’ve talked about using WordPress with Delphix and Docker. One of my desired use cases was to be able to spin up multiple wordpress containers as in

Screen Shot 2015-12-02 at 3.37.42 PM

The above diagram shows Delphix, but that’s beside the point. The point is that I want, and should be able to, run multiple docker wordpress containers. In actuality I never got this working correctly. Running multiple wordpress containers at the same time was fine; the problem was accessing the website of each container. There are multiple wordpress docker images available: the official wordpress image and a number of other popular ones. The two others I tried were “tutum/wordpress” and “centurylink/wordpress”.

Official WordPress Docker image

Then start docker and pull the official “wordpress” image

service docker start
docker pull wordpress

Start the first docker wordpress container

docker run -p 80:80 --name wordpress1  \
           -e WORDPRESS_DB_HOST=172.16.160.160:3306 \
           -e WORDPRESS_DB_USER=wordpressuser \
           -e WORDPRESS_DB_PASSWORD=password \
           -d wordpress

Start a second Docker WordPress container, using a different MySQL database and mapping host port 81 to container port 80

docker run -p 81:80 --name wordpress2 \
           -e WORDPRESS_DB_HOST=172.16.160.160:3307 \
           -e WORDPRESS_DB_USER=wordpressuser \
           -e WORDPRESS_DB_PASSWORD=password \
           -d wordpress

And accessing each of these actually works: just go to the host URL for the first one and host:81 for the second. WordPress listens on port 80, the default HTTP port, so the first container is reached with just the host name as the URL, while the second container is reached on port 81.
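A quick way to confirm which host port each container is published on (assuming both containers above are still running) is docker port, which should show something like:

# docker port wordpress1
80/tcp -> 0.0.0.0:80
# docker port wordpress2
80/tcp -> 0.0.0.0:81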

The problem is that after we access the second wordpress website via port 81, the URL gets rewritten without the port, and thus we start accessing the first wordpress container.

I tried to turn off URL rewriting by running

a2dismod rewrite

and bouncing apache

/etc/init.d/apache2 stop
/etc/init.d/apache2 start

and editing the file

/var/www/html/.htaccess

to have “RewriteEngine Off” (sketched below), but no dice. The URLs kept getting rewritten, so I decided to try a different wordpress docker image.
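For reference, here is a sketch of making that change from the host with docker exec rather than a shell inside the container (it assumes the stock image layout, with WordPress under /var/www/html and a standard .htaccess already written by WordPress):

# disable mod_rewrite and turn the rewrite engine off in .htaccess
docker exec wordpress1 a2dismod rewrite
docker exec wordpress1 sed -i 's/RewriteEngine On/RewriteEngine Off/' /var/www/html/.htaccess
# restart the container to pick up the module change
docker restart wordpress1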

WordPress image from tutum

docker pull tutum/wordpress
docker run -p 80:80 --name wordpress1 \
           -e WORDPRESS_DB_HOST=172.16.160.161:3307 \
           -e WORDPRESS_DB_USER=wordpressuser \
           -e WORDPRESS_DB_PASSWORD=password \
           -d tutum/wordpress

but this image seems to create a MySQL database itself instead of using the one I specified. Looking at the run.sh on github, https://github.com/tutumcloud/lamp/blob/master/run.sh, it looks like it’s hard-coded to look for MySQL datafiles in one location, so this is not going to work with multiple virtual databases.

WordPress image from centurylink

docker pull centurylink/wordpress
# start the first container
docker run -p 80:80 --name wordpress1 \
           -e DB_PASSWORD=delphixdb \
           -e DB_1_PORT_3306_TCP_ADDR=172.16.160.161 \
           -e DB_1_PORT_3306_TCP_PORT=3307  \
           -d centurylink/wordpress

but it couldn’t connect to my MySQL database.  To see the errors you can run

docker logs wordpress1

The default user for MySQL in this image is root, and root requires some extra setup. Trying to access MySQL directly as root gives an error:

mysql -uroot -pdelphixdb -h172.16.160.161 -P3307
Warning: Using a password on the command line interface can be insecure.
ERROR 1045 (28000): Access denied for user 'root'@'linuxtarget' (using password: YES)

Allowing root access requires connecting to MySQL via the socket file:

mysql -u root -pdelphixdb  --socket=/u01/app/toolkit/provision/V/u02/app/mysql/data-NWO/temp/mysql.sock

To find the socket file, go to the location where Delphix mounts the datafiles onto the host and search for mysql.sock.
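One way to search for it from the host (a sketch; this assumes the Delphix mounts live somewhere under /u01/app/toolkit/provision, as in the path above):

# look for the MySQL socket file under the Delphix mount point
find /u01/app/toolkit/provision -name mysql.sock 2>/dev/null

After connecting, tell the MySQL database to allow remote (socketless) root connections: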

GRANT SUPER ON *.* TO 'root'@'%' IDENTIFIED BY 'delphixdb';
GRANT ALL ON *.* TO 'root'@'%' IDENTIFIED BY 'delphixdb';
FLUSH PRIVILEGES;
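Once the grants are in, the TCP connection that failed earlier should work; a quick check against the same host and port as the failed attempt above:

mysql -uroot -pdelphixdb -h172.16.160.161 -P3307 -e "select current_user();"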

Now I can start two wordpress containers

docker run -p 80:80 --name wordpress1 \
           -e DB_PASSWORD=delphixdb \
           -e DB_1_PORT_3306_TCP_ADDR=172.16.160.161 \
           -e DB_1_PORT_3306_TCP_PORT=3306  \
           -d centurylink/wordpress

 

docker run -p 81:80 --name wordpress2 \
           -e DB_PASSWORD=delphixdb \
           -e DB_1_PORT_3306_TCP_ADDR=172.16.160.161 \
           -e DB_1_PORT_3306_TCP_PORT=3307  \
           -d centurylink/wordpress

The problem now is that when I access the wordpress website, the page comes out blank with no errors. I haven’t tracked down why this isn’t working.

Summary

At this point I haven’t managed to create multiple wordpress containers and cleanly access each website separately.

The next step is probably to create my own wordpress docker image, set up like the official one but without URL rewriting. TBA.

A few useful commands I ran into along the way

# docker ps
CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS              PORTS                NAMES
113559099a60        wordpress:latest    "/entrypoint.sh apac   6 seconds ago       Up 4 seconds        0.0.0.0:80->80/tcp   wordpress1

lists the running containers. From this list I can stop them, remove them, see their logs, etc. For example:

# docker logs wordpress1
WordPress not found in /var/www/html - copying now...
WARNING: /var/www/html is not empty - press Ctrl+C now if this is an error!
+ ls -A
index.html
+ sleep 10
Complete! WordPress has been successfully copied to /var/www/html
Warning: mysqli::mysqli(): (HY000/1045): Access denied for user 'wordpressuser'@'172.16.160.160' (using password: YES) in - on line 10

stop/start/delete

docker stop wordpress1
docker start wordpress1
docker stop wordpress1
docker rm wordpress1

getting a shell into one of the containers

docker exec -i -t wordpress1 bash

get a list of downloaded images

# docker images
REPOSITORY              TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
wordpress               latest              9909dec6d65f        9 days ago          514.8 MB
tutum/wordpress         latest              5025a6da41dd        4 months ago        493.6 MB
centurylink/wordpress   latest              91f5520cafc8        7 months ago        520.1 MB

get details on a container (output is truncated as it is long)

# docker inspect wordpress1
[{
    "AppArmorProfile": "",
    "Args": [
        "apache2-foreground"
    ],
    "Config": {
        "AttachStderr": false,
        "AttachStdin": false,
        "AttachStdout": false,
        "Cmd": [
            "apache2-foreground"
        ],
        "CpuShares": 0,
        "Cpuset": "",
        "Domainname": "",
        "Entrypoint": [
            "/entrypoint.sh"
        ],
        "Env": [
            "WORDPRESS_DB_HOST=172.16.160.160:3307",

A local directory can be mapped into a container with the “-v” option, which makes sharing files with the container easier and allows for persistent storage across container creation and deletion:

# docker run -p 3142:80 --name wordpress1 \
           -v /tmp/wordpress:/tmp/container/wordpress \
           -e WORDPRESS_DB_HOST=172.16.160.161:3306 \
           -e WORDPRESS_DB_USER=wordpressuser \
           -e WORDPRESS_DB_PASSWORD=password \
           -d wordpress

This maps the local host directory /tmp/wordpress to the container directory /tmp/container/wordpress. Mapping can be limited to a single file and constrained to read-only instead of the default read/write.
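For example, the same mapping can be narrowed to a single file and made read-only with the “:ro” suffix (a sketch; the file name, container name and host port here are made up):

# mount a single host file into the container, read-only
docker run -p 3143:80 --name wordpress_ro \
           -v /tmp/wordpress/notes.txt:/tmp/container/notes.txt:ro \
           -e WORDPRESS_DB_HOST=172.16.160.161:3306 \
           -e WORDPRESS_DB_USER=wordpressuser \
           -e WORDPRESS_DB_PASSWORD=password \
           -d wordpress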

ATTENTION: another useful bit of knowledge is changing the wordpress siteurl for the new container:

mysql -u wordpressuser -ppassword -h 172.16.160.161 -P 3306 wordpress << EOF
update wp_options set option_value='http://172.16.160.161' where option_id<3;
select option_value,option_id from wp_options where option_id < 4;
EOF

 

Uncategorized

Docker and Delphix architectures

December 2nd, 2015

In my last post I showed using Docker and Delphix to support WordPress.

I use wordpress for this blog. It works fine on its own; it’s just me making a few posts here and there. Occasionally there are problems, like an upgrade that goes bad or a hack that gets some redirection code into the site. In those cases I have to go to a backup of the MySQL database that wordpress uses on my site. The database is small, so it’s pretty quick to back up, but I don’t back it up regularly. I know I should, and it would be nice to have a backup in the event that data is corrupted somehow (like a hack into the contents of the database).

WordPress uses MySQL as its data store, and all content changes are stored in MySQL. The MySQL data can be linked to Delphix, which automates data management for MySQL (or any data) by providing backups, versioning and fast thin cloning of the data for use in development and QA.

Using WordPress as an example, there are a number of architectures we could use. First, we don’t need Delphix or Docker at all and could just set it up, as I have with this blog:

Screen Shot 2015-12-01 at 1.26.20 PM

One weakness of this architecture is that any changes to the wordpress website are made directly on the source. Why is that a problem? It’s a problem if something goes wrong when deploying changes. How do you roll back incorrect changes? What happens if multiple developers are working on the site? Is there any way to version changes and keep one developer’s changes separate from another’s?

I just use wordpress for my personal blog, but what if you used it for your business, with multiple people making changes? In that case, ideally you want to make changes on a staging site and, once they are validated, push them to the production site.

Ideally development on the wordpress site is done on a staging or development server.

Screen Shot 2015-12-01 at 3.48.12 PM

The question is, how do you keep the data on the development host in sync with the source host, and how would you roll changes from development into the source? One answer for deploying changes would be to use something like RAMP. But if RAMP pushes changes to production, how do we push changes in production back to the staging environment? What about data coming into production such as comments, feedback and forms? How do we get that data back to the development environment? That’s where Delphix shines.

Screen Shot 2015-12-01 at 3.33.01 PM

Delphix connects to the MySQL database on production and syncs all the data changes onto Delphix, providing a timeline (down to the second) of changes. These versions of the database can be provisioned out to a target host via what is called “thin cloning”. When a thin clone is made, data is not moved or changed; we just make an image of the data at that point in time available to the database instance, mounted over NFS or iSCSI. The only things stored in a thin clone are the changes made to it, and those changes are only visible to the clone that made them. This architecture provides two things:

  1. Backups of production, down to the second, for multiple weeks, generally stored in less than the size of the original database thanks to compression and deduplication, and accessible in a matter of minutes.
  2. Thin clones of the data, providing as many copies to as many developers as we want, almost for free.

Point 1, backups, is a huge peace of mind. Once Delphix is connected to the production/source database, backup is automatic and recovery is a no-stress few clicks of the mouse.

Point 2 supports a more robust development environment like

Screen Shot 2015-12-01 at 3.51.19 PM

In this environment I can have multiple target hosts where developers can each work on their own private copy of the production database and thus website. We can even have extra copies to test merging of changes from different developers. What happens though if we want all the developer copies on one machine like:

Screen Shot 2015-12-01 at 3.52.53 PM

The problem with this is that I don’t know how to run multiple instances of wordpress on one machine. An easy solution is to use Docker containers, so that each instance of wordpress is separate from the others:

Screen Shot 2015-12-01 at 3.56.09 PM

Docker containers are self-contained and don’t impact each other (except potentially at the resource-consumption level, like CPU).
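As a rough sketch, each developer environment in the diagram above is just another docker run of the wordpress image, published on its own host port and pointed at that developer’s thin clone of MySQL (the ports, container names and MySQL ports below are illustrative):

# developer 1: website on host port 80, thin clone of MySQL on port 3306
docker run -p 80:80 --name wp_dev1 \
           -e WORDPRESS_DB_HOST=172.16.160.160:3306 \
           -e WORDPRESS_DB_USER=wordpressuser \
           -e WORDPRESS_DB_PASSWORD=password \
           -d wordpress

# developer 2: website on host port 81, thin clone of MySQL on port 3307
docker run -p 81:80 --name wp_dev2 \
           -e WORDPRESS_DB_HOST=172.16.160.160:3307 \
           -e WORDPRESS_DB_USER=wordpressuser \
           -e WORDPRESS_DB_PASSWORD=password \
           -d wordpress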

Docker containers are also quick to spin up, allowing fast failover when used in conjunction with Delphix:

Screen Shot 2015-12-01 at 4.01.41 PM

Finally, we could combine the architectures to support quick failover, recovery, versioning and multiple developer environments:

Screen Shot 2015-12-01 at 3.58.59 PM

In this case, our production MySQL database runs directly off data served by Delphix. This allows us to quickly roll back any changes by simply using Delphix to roll back to an earlier version of the database, or we could promote a developer copy directly to production. And if the host went down, we could fail over to another machine quickly by starting a docker wordpress container there and provisioning it with a thin clone from Delphix in minutes.

Uncategorized