Top 3 criteria to choose a virtual data solution
Data virtualization solutions, also known as Copy Data Management (CDM), Virtual Copy Data (VCD) and Virtual Data Appliances (VDA), are spreading rapidly: over 100 of the Fortune 500 adopted a data virtualization solution between 2010 and the end of 2015. Adoption is hardly surprising given that virtual data reduces the time to provision copies of large data sets from days to minutes and eliminates most of the space required for copies of data. How many copies of large data sets do companies have? Database vendor Oracle claims that on average a customer has 12 copies of production databases in non-production environments such as development, QA, UAT, backup, business intelligence and sandboxes, and Oracle expects the number of copies to double by the time its latest database version, 12c, is fully adopted. With Fortune 500 companies often having thousands of databases, and these databases reaching multiple terabytes in size, the downstream storage costs of these data copies can be staggering.
- What unique features does each vendor provide to help achieve my business goals?
- Does the solution support my full IT environment, or is it niche/vendor specific?
- How much automation, self-service and application integration is pre-built vs. requires customization?
- Are the vendor's existing customers similar to my organization in size and nature?
- Is the solution simple and powerful or just complicated?
- Does the solution address your business goals?
- Does the solution support your entire IT landscape?
- Is the solution automated, complete and simple?
- Storage savings
- Application development acceleration
- Data protection & production support
Storage savings
All data virtualization solutions offer storage savings because virtual data provides thin clones: each new copy of the data initially takes up no new space. New space is used only once a copy begins to modify data, and only the modified data requires additional storage.
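The thin-clone behavior described above can be illustrated with a toy copy-on-write model. This is only a sketch to show the idea, not how any particular product is implemented; the `ThinClone` class and the block names are hypothetical.

```python
class ThinClone:
    """Toy copy-on-write clone: reads fall through to shared source blocks."""

    def __init__(self, source_blocks):
        self._source = source_blocks  # shared with every clone, never modified here
        self._changed = {}            # block index -> this clone's private copy

    def read(self, index):
        # Prefer the clone's private copy; otherwise read the shared block.
        return self._changed.get(index, self._source[index])

    def write(self, index, data):
        # Only a write allocates new space for this clone.
        self._changed[index] = data

    def private_blocks(self):
        # Number of blocks this clone stores on its own.
        return len(self._changed)


source = ["block-a", "block-b", "block-c", "block-d"]
dev = ThinClone(source)
qa = ThinClone(source)

# Modifying one block in the dev copy leaves the source and the QA copy untouched.
dev.write(1, "block-b-modified")
```

After the write, `dev` holds one private block while `qa` still holds none, which is why the space used by virtual copies grows with the amount of changed data rather than with the number of copies.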
Comparing storage savings
To compare the storage savings of various solutions, find out how much storage is required to initially link to a data source and how much is required to store new modifications. Of the solutions we’ve looked at, the initial storage required ranges from 1/3 the size of the source data up to 3x the size of the source data. Some can store newly modified data in 1/3 of its actual size thanks to compression. Other solutions have no compression, and some have to store redundant copies of changed data blocks.
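A rough way to put numbers on this comparison is sketched below. The 12 copies, the 1/3 and 3x initial footprints, and the 1/3 compression ratio come from the figures above; the 3 TB source size and the 0.3 TB of changes per copy are made-up inputs for illustration, and the function names are hypothetical.

```python
def physical_copies_tb(source_tb, copies):
    """Storage for full physical copies: every copy duplicates the source."""
    return source_tb * copies


def virtual_copies_tb(source_tb, copies, initial_ratio, changed_tb_per_copy, change_ratio):
    """Storage for virtual copies.

    initial_ratio: footprint of the initial link as a multiple of the source size
    change_ratio:  space used per TB of modified data (below 1.0 means compression)
    """
    initial = source_tb * initial_ratio
    changes = copies * changed_tb_per_copy * change_ratio
    return initial + changes


# Assumed inputs: a 3 TB source, 12 copies, 0.3 TB modified in each copy.
full = physical_copies_tb(3.0, 12)
# Solution A: initial link at 1/3 of the source, changes compressed to 1/3.
solution_a = virtual_copies_tb(3.0, 12, 1 / 3, 0.3, 1 / 3)
# Solution B: initial link at 3x the source, changes stored uncompressed.
solution_b = virtual_copies_tb(3.0, 12, 3.0, 0.3, 1.0)

print(f"physical: {full:.1f} TB, A: {solution_a:.1f} TB, B: {solution_b:.1f} TB")
```

Under these assumed inputs both virtual options use far less storage than full physical copies, but the gap between the two virtual options is itself large, which is why the initial footprint and the cost of storing changes are worth asking each vendor about.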
Data agility more important than storage savings
Application development acceleration
User friendly self service interface
Data virtualization solutions can provide powerful data protection. Suppose someone corrupts production data, for example by dropping a table, or a batch job errors out halfway after modifying only part of the data. A virtual database can be spun up in minutes, and the uncorrupted data exported from it and imported into the production database. We have heard numerous stories of the wrong table being dropped on production, or of a batch job deleting or modifying the wrong data, with the change propagated immediately to the standby and thus unrecoverable from the standby.
Data virtualization can save the day by recovering the data in minutes. It can offer impressively fine-grained and wide time windows for Recovery Point Objectives (RPO) and fast Recovery Time Objectives (RTO).
Time window size and granularity
Finally, look into how easy or difficult it is to provision the data required. If the data required is a database, provisioning can be a complicated task without automation. Does the solution offer point-and-click provisioning of a running database, down to the second, at a past point in time? How easy or difficult is it to choose the point in time from which the data is provisioned? Is choosing a point in time a simple UI widget, or does it require manual application of database logs or manual editing of scripts?
2. Support your entire IT landscape
Is the solution software that runs on any hardware, or does it require specialized hardware? Can the solution use any storage system in your IT landscape, or is it restricted to specialized storage systems? Will the solution lock you into a specific storage type, or will it give you full flexibility to adopt new storage types as they become market leaders, such as newer, better and more affordable flash storage systems? Does your IT landscape use the cloud, and does the solution support your IT department’s cloud requirements?
- Oracle
- Oracle RAC
- SQL Server
- MySQL
- DB2
- Sybase
- PostgreSQL
- Hadoop
- MongoDB
Does your IT landscape require data virtualization for any of the following, and does the solution automate support for these data types?
- web application tiers
- Oracle EBS
- SAP
- regular files
- other datatypes
Does the solution support all of the operating system types in your IT landscape?
- Linux
- HP-UX
- Solaris
- Windows
- AIX
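One way to keep track of the answers to these questions is a simple coverage check of each vendor against your own landscape. The sketch below is illustrative only: `coverage_gaps`, the category names and the `vendor_x` support list are hypothetical, and the landscape shown is just a sample drawn from the lists above.

```python
def coverage_gaps(required, supported):
    """Return, per category, the required items a vendor does not support."""
    gaps = {}
    for category, items in required.items():
        missing = set(items) - set(supported.get(category, ()))
        if missing:
            gaps[category] = sorted(missing)
    return gaps


# What your IT landscape actually uses (sample values).
landscape = {
    "databases": ["Oracle", "SQL Server", "MySQL"],
    "applications": ["Oracle EBS"],
    "operating_systems": ["Linux", "Windows", "AIX"],
}

# What a hypothetical vendor claims to support.
vendor_x = {
    "databases": ["Oracle", "SQL Server"],
    "operating_systems": ["Linux", "Windows", "AIX", "Solaris"],
}

gaps = coverage_gaps(landscape, vendor_x)
print(gaps)
```

An empty result means the vendor covers everything you listed; anything else is a concrete gap to raise with the vendor.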
3. Fully Automated, Complete and Simple
How automated is the solution? Can an end user provision data, or does it require a specialized technician such as a storage admin or DBA? When provisioning databases such as Oracle, SQL Server or MySQL, does the solution fully and automatically provision a running database, or are manual steps required? For example, some solutions only provision data from a single point in time from the data source. What if a user requires a different point in time? How much manual intervention is required? Some solutions only support provisioning data from specific snapshots in the past. What if a user requires a point in time that falls between snapshots? How much manual intervention is required then? Does the solution collect changes from the data source automatically, or does it require other tools or manual work to collect changes or get newer copies of the source data?
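To see what provisioning "between snapshots" involves, here is a minimal sketch of the planning step: pick the newest snapshot at or before the requested time, then replay the collected changes from that snapshot up to the requested second. It assumes the solution keeps both snapshots and a continuous stream of changes; `plan_provision` and the dates are hypothetical and do not reflect any vendor's API.

```python
from bisect import bisect_right
from datetime import datetime


def plan_provision(snapshot_times, target):
    """Pick the newest snapshot at or before `target` and the change window to replay."""
    ordered = sorted(snapshot_times)
    pos = bisect_right(ordered, target)  # count of snapshots taken at or before target
    if pos == 0:
        raise ValueError("target is older than the earliest snapshot")
    base = ordered[pos - 1]
    return {"base_snapshot": base, "replay_from": base, "replay_to": target}


# Daily snapshots, with a request for a point in time between two of them.
snapshots = [datetime(2015, 6, 1), datetime(2015, 6, 2), datetime(2015, 6, 3)]
plan = plan_provision(snapshots, datetime(2015, 6, 2, 14, 30, 5))
print(plan)
```

A solution that automates this step hides it behind a time picker; one that does not leaves the snapshot selection and the log replay to a DBA or storage admin.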
Simple
Does the solution come with an alerting framework to make administration easier?
Does the interface provide a “single pane of glass” that scales to 1000s of virtual data copies across potentially 100s of separate locations in your IT landscape?
Is it easy to add more storage to the solution? Is it easy to remove unneeded storage from the solution?
In Summary
Find out how powerful, flexible and complete the solution is.
Is the solution a point solution or a complete solution?
Some solutions are specific point solutions, for example only for Oracle databases. Some are tied to specific hardware or storage systems, while others are complete software solutions. Complete, flexible solutions sync automatically with source data, collect all changes from the source, provision data down to the second from anywhere within the collected time window, support any data type or database on any hardware, and support the cloud.
Does the solution provide self service and user functionality?
- Point-in-time provisioning
- Reset, branch and rollback of environments
- Refresh parent and children environments with the latest data
- Provision multiple source environments to the same point in time
- Automation / self-service / auditing capabilities
Some simple technical differentiators
- Does it support your data and database types on your systems and operating systems?
- Does it run on your existing data center resources, or does it require specialized hardware or storage?
- Does it sync automatically with source data, or does it leave syncing as a manual exercise or require other solutions?
- Can it provision data copies down to the second from an extended time window into the past?
- Can it branch virtual data copies?
- Is cloud support included?
But when it comes down to it, even after asking all these questions, don’t rely on the answers alone. Ask the vendor to prove it. Ask the vendor to provide in-house access to the solution, and see for yourself how easy or hard it is to install, manage and execute the functionality required.
For more information also see
- Top 10 evaluation criteria for copy data management & data virtualization
- Snap Clone for Oracle only