Credit: Matthias Weinberger (CC2)
What is DevOps?
As has been noted many times, DevOps is a term that is hard to define.
DevOps is a term that is so easily misapplied in the enterprise that its popularity threatens adoption.
If you can’t define it for the bean counters – it might as well not exist at all.
– Tim O’Brien, Gleanster
Thus being able to clearly define DevOps is crucial to its success.
DevOps is a goal
One of the problems in defining DevOps is that no two DevOps teams are doing the same thing, so defining DevOps by what they do is currently impossible. On the other hand, all DevOps teams have the same goal. That goal has been clearly articulated by Gene Kim as:
DevOps enables the fast flow of features from development to IT operations to production.
One could expand a bit more and say
DevOps enables Development, QA, IT Operations, Information Security and Product Management to work together to enable the fast flow of features from the business, to development, to QA, to IT to the customer, while preserving the stability, availability, security, and manageability of the production environment.
How is the DevOps goal attained?
Everyone agrees on the goal, but the million-dollar question is how to attain it. Which tools and processes? There is a nagging lack of discussion about which tools and processes best achieve the DevOps goal of fast flow of features. To increase flow in any system, the only productive action is to improve the constraint, the slowest point in the flow. As the work of Eli Goldratt demonstrates, any improvement not made at the constraint is an illusion. Thus the first step in DevOps is to identify the top constraint(s) and optimize them.
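The point about constraints can be made concrete with a small sketch. The stage names and rates below are made-up illustrative numbers, not survey data; the only claim being modeled is that a serial flow moves no faster than its slowest stage.

```python
# Hypothetical stages and rates (features per week); illustrative numbers only.

def pipeline_throughput(rates):
    """A serial pipeline can move work no faster than its slowest stage."""
    return min(rates.values())


def constraint(rates):
    """Return the name of the slowest stage, i.e. the constraint."""
    return min(rates, key=rates.get)


rates = {
    "development": 10,
    "environment_provisioning": 2,
    "qa": 6,
    "deployment": 8,
}

print(constraint(rates))           # environment_provisioning
print(pipeline_throughput(rates))  # 2

# Doubling a stage that is not the constraint leaves overall flow unchanged.
faster_dev = dict(rates, development=20)
print(pipeline_throughput(faster_dev))  # 2

# Improving the constraint raises flow until the next constraint takes over.
faster_env = dict(rates, environment_provisioning=7)
print(pipeline_throughput(faster_env))  # 6
print(constraint(faster_env))           # qa
```

Once provisioning is no longer the slowest stage, the constraint moves to QA, which is why the tuning has to be repeated one constraint at a time.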
Gene Kim, who has worked with hundreds of CIOs and surveyed 14,000 companies, lists the top five constraints on the fast flow of features as:
- Provisioning environments for development
- Setting up test and QA environments
- Architecting code in development to facilitate easy changes in code
- Development speed of coding
- Product management feature input
The only way to optimize the flow is to take the top constraint, tune it, then move on to the next constraint. The two most commonly seen constraints both center on environment provisioning. As Gene Kim puts it:
One of the most powerful things that organizations can do is to enable development and testing to get the environment they need when they need it. – Gene Kim
Thus the question for most companies is how to improve the provisioning of environments for development and QA.
Why is environment provisioning such a bottleneck?
Environment provisioning is a bottleneck because, historically, environments have taken a long time and many resources to create, and every group depends on them. Developers depend on these environments to code the applications. QA depends on them to test the code. IT depends on them to verify deployment processes. When environment provisioning can be done quickly, repeatedly and efficiently, one can accomplish the core of DevOps: Continuous Integration and Continuous Delivery. Continuous Delivery is an automated process in which developers’ code check-ins are integrated multiple times a day, run through QA and verified in IT for correct deployment. Some companies take it even further with Continuous Deployment, in which the changes that pass testing are rolled via automation into production.
Continuous Delivery enables the fast flow of features from development, to QA, to UAT, to IT (and even to production in the case of Continuous Deployment).
What tools enable Continuous Deployment?
Continuous deployment depends on automation and efficient environment provisioning, so the key tools are those that automate code management, automate environment provisioning and improve efficiency.
Minimum prerequisites for continuous integration
- version control system like Git
- test automation like Jenkins
- environment provisioning
Version control can be implemented with any number of tools, of which one of the most popular now is Git.
Test automation tools such as Jenkins and TeamCity can orchestrate running tests when changes are introduced in source control repositories.
But moving to continuous integration, where code changes can be QA’ed and verified for deployment multiple times a day, requires efficient provisioning of QA and verification environments.
Reaching the CI/CD goal of multiple deployment tests a day requires not only integrating code check-ins in source control with automated test management tools, but also fast, efficient access to environments in which to run those tests.
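As a rough illustration of how these pieces fit together, the sketch below simulates the loop: each check-in gets a fresh environment, and the tests run against it. Everything here is hypothetical (the `Commit` type, `provision_environment`, `run_tests`); it is not the API of Jenkins, TeamCity or Git, only the shape of the flow they automate.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Commit:
    sha: str
    code: Callable[[int], int]  # stand-in for the application under test


def provision_environment(commit: Commit) -> Dict[str, object]:
    """Hypothetical stand-in for provisioning a fresh test environment."""
    return {"commit": commit.sha, "app": commit.code}


def run_tests(env: Dict[str, object]) -> bool:
    """Run a tiny test suite against the environment; True if all tests pass."""
    app = env["app"]
    return app(2) == 4 and app(0) == 0


def on_commit(commit: Commit) -> str:
    """What a CI server does per check-in: provision, test, report."""
    env = provision_environment(commit)
    return "passed" if run_tests(env) else "failed"


commits = [
    Commit("a1", lambda x: x * 2),  # correct doubling function
    Commit("b2", lambda x: x + 2),  # regression: wrong for x == 0
]
results = {c.sha: on_commit(c) for c in commits}
print(results)  # {'a1': 'passed', 'b2': 'failed'}
```

In this toy version provisioning is instantaneous. In practice the time `provision_environment` takes is what limits how many times a day the loop can run.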
One of the first decisive technologies to support efficient environment provisioning was machine virtualization, such as VMware. With machine virtualization, virtual machines (VMs) could be spun up or down quickly and easily, providing the efficiency gains needed for DevOps and, because provisioning was now controlled by software, opening it up to automation.
Configuration tools automate VM provisioning
Configuration tools like Chef, Puppet and Ansible are programmed to automate and manage the provisioning of VMs with the desired configurations, packages and software.
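The core idea behind such tools is to declare the desired state and let the tool work out what has to change. The sketch below shows that idea in plain Python under simplifying assumptions: a VM is modeled as a dictionary of package versions, and `converge` is a hypothetical helper, not part of Chef, Puppet or Ansible.

```python
def converge(current, desired):
    """Bring `current` (package -> version) in line with `desired`.

    Mutates `current` and returns the list of actions taken.
    """
    actions = []
    # Install or upgrade anything missing or at the wrong version.
    for package, version in desired.items():
        if current.get(package) != version:
            actions.append(f"install {package}={version}")
            current[package] = version
    # Remove anything that is not part of the desired state.
    for package in sorted(set(current) - set(desired)):
        actions.append(f"remove {package}")
        del current[package]
    return actions


# Hypothetical package names and versions, for illustration only.
vm = {"nginx": "1.18", "telnet": "0.17"}
desired = {"nginx": "1.24", "postgresql-client": "15"}

first_run = converge(vm, desired)
print(first_run)
# ['install nginx=1.24', 'install postgresql-client=15', 'remove telnet']

second_run = converge(vm, desired)
print(second_run)  # []
```

The second run does nothing because the VM already matches the desired state. That idempotence is what makes it safe to apply the same configuration repeatedly to every new environment.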
Continuous Integration (CI/CD) Landscape
Thus to implement CI/CD requires
- version control system like Git
- test automation like Jenkins and TeamCity
- efficient environment provisioning
  - Machine virtualization (VM): VMware, KVM, Docker, etc.
  - VM provisioning tools: Chef, Puppet, Ansible
Biggest Obstacle to CI/CD
Now that code changes are automatically QA’ed and VMs can be spun up in the right configuration, there is still the problem of “how do we get data to run QA tests on?” For example, what if the application depends on data managed by a large Oracle database? How can one efficiently produce copies of this database for use in the flow from development to testing to deployment?
Fortunately there is a technology called data virtualization. As virtual machine technology opened the door to continuous integration, data virtualization swings it wide open for enterprise-level application development that depends on large databases.
Data virtualization is an architecture (which can be encapsulated in software, as Delphix has done) that connects to source data or a source database, takes an initial copy, and from then on collects only the changes from the source (like EMC SRDF, NetApp SMO or an Oracle standby database). The data is saved on storage that has snapshot capabilities, either natively (as in NetApp and ZFS) or through software like Delphix that maps a snapshot filesystem onto any storage, even JBODs. The data is managed as a timeline on the snapshot storage. For example, Delphix saves 30 days of changes by default. Changes older than 30 days are purged, so a copy can be made as of any second within this 30-day window.
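A minimal sketch of the timeline idea follows, assuming a toy key/value "database" and integer timestamps in seconds. The `DataTimeline` class and its methods are hypothetical names for illustration; this is not how Delphix or any storage product is implemented, and a real system would share blocks rather than copy a dictionary.

```python
class DataTimeline:
    """Initial copy plus an ordered change log, kept for a retention window."""

    def __init__(self, initial, retention_seconds):
        self.baseline = dict(initial)  # state as of baseline_time
        self.baseline_time = 0
        self.changes = []              # (timestamp, key, value), oldest first
        self.retention = retention_seconds

    def record(self, timestamp, key, value):
        """Collect one change from the source (timestamps must not decrease)."""
        self.changes.append((timestamp, key, value))

    def purge(self, now):
        """Fold changes older than the retention window into the baseline."""
        cutoff = now - self.retention
        while self.changes and self.changes[0][0] <= cutoff:
            _, key, value = self.changes.pop(0)
            self.baseline[key] = value
        self.baseline_time = max(self.baseline_time, cutoff)

    def provision(self, timestamp):
        """Return a copy of the data as of `timestamp`."""
        if timestamp < self.baseline_time:
            raise ValueError("requested time is outside the retention window")
        copy = dict(self.baseline)
        for ts, key, value in self.changes:
            if ts > timestamp:
                break
            copy[key] = value
        return copy


DAY = 24 * 3600
timeline = DataTimeline({"balance": 100}, retention_seconds=30 * DAY)
timeline.record(10, "balance", 120)
timeline.record(20, "balance", 90)

print(timeline.provision(5))   # {'balance': 100}
print(timeline.provision(15))  # {'balance': 120}
print(timeline.provision(25))  # {'balance': 90}

# Thirty days and 12 seconds later, the change made at second 10 falls out
# of the window and is folded into the baseline.
timeline.purge(now=30 * DAY + 12)
print(timeline.provision(20))  # {'balance': 90}
```

After the purge, asking for second 5 raises `ValueError`, because that point in time is no longer within the retained window.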
Thus to implement CI/CD requires efficient environment provisioning. Efficient environment provisioning depends upon tools and technology such as:
- Configuration management: Chef, Puppet, Ansible
- Machine virtualization: VMware, KVM, Docker, etc.
- Data virtualization: Delphix
Once environment provisioning is streamlined, the next step is automating the flow of developer commits through QA and deployment testing with test automation tools like Jenkins and TeamCity.
Examples are emerging that document how to wire these tools together, such as a recent one on Jenkins and Ansible.
Efficient environment provisioning and test automation are just the core. From there we can build out more of the processes to attain the goals of DevOps.
The ambiguous descriptions of DevOps are undermining the movement, but DevOps can be defined as the tools and culture that support a continuous delivery value chain. With that stake in the ground to rally around, the goal is no longer to say what is or is not DevOps but to map out the best tools, processes, methods and cultural changes that support this definition, and to map these to the various situations encountered in the industry. Everyone is doing DevOps, i.e. software development, integration, QA and release; the question is how well you are doing it and whether you can improve it.
Virtual data improves businesses’ bottom line by eliminating the enormous infrastructure, bureaucracy and time drag involved in provisioning databases and data for business intelligence groups and development environments. Both depend on having copies of production data and databases, and data virtualization allows provisioning in a few minutes with almost no storage overhead by sharing duplicate blocks among all the copies.
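The storage claim can be illustrated with a small sketch of block sharing. It assumes a content-addressed block store, which is one simple way to share identical blocks and is chosen here only for illustration; `BlockStore` and `VirtualCopy` are hypothetical names, not a description of any product's internals.

```python
import hashlib


class BlockStore:
    """Stores each distinct block once, keyed by its SHA-256 digest."""

    def __init__(self):
        self.blocks = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(digest, data)
        return digest


class VirtualCopy:
    """A copy is only a list of block references into the shared store."""

    def __init__(self, store, block_ids):
        self.store = store
        self.block_ids = list(block_ids)

    def clone(self):
        # Cloning copies references, not data, so it is fast and nearly free.
        return VirtualCopy(self.store, self.block_ids)

    def write(self, index, data: bytes):
        # A changed block is stored separately; other copies are unaffected.
        self.block_ids[index] = self.store.put(data)

    def read(self, index) -> bytes:
        return self.store.blocks[self.block_ids[index]]


store = BlockStore()
source = VirtualCopy(store, [store.put(f"row-{i}".encode()) for i in range(1000)])
clones = [source.clone() for _ in range(5)]

clones[0].write(0, b"changed in dev")

print(len(store.blocks))  # 1001 blocks stored, not 6 * 1000
print(clones[0].read(0))  # b'changed in dev'
print(clones[1].read(0))  # b'row-0'
```

Five extra copies of a 1,000-block source cost one additional block in this toy model, because only the block that a developer changed needs its own storage.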
Here is an episode from Arrested DevOps talking about the problem with data and databases in DevOps