This article originally appeared in Virtual Strategy Magazine. It’s reprinted here in its entirety.
IT departments today are rapidly deploying virtualization technologies – in fact, Gartner reports that server virtualization is already more than 60% penetrated and projects it will exceed 80% by 2016. With this rapid rise in virtualization deployment, customers are challenged to incorporate data protection and archive methodologies for their virtualized data. ESG recently indicated that 60% of virtualization technology users plan to address data protection challenges for their virtualized data as a top priority for 2013, an astonishing number. There are certainly lots of options available to protect traditional data, but virtualized data is a different beast. Another Gartner report finds that more than one-third of organizations will change backup vendors due to factors including cost, complexity, or capability. Based on what I’ve heard talking with customers, I would add one more factor: compatibility. In this article we will explore all four of these areas and suggest ways to overcome the challenges associated with virtualized data protection.
Steve Duplessie, ESG Founder and Senior Analyst, recently confessed that he is “still amazed that mundane and routine things in IT, such as data protection, are still huge issues for customers.” Adding virtualization into the mix does not necessarily make things any easier. When customers first consider protecting their virtual data, they may think about simply using an existing traditional backup application to meet their data protection goals, but there are reasons for caution. Traditional backup software can quickly become too costly for managing virtual data. Most traditional applications charge per agent for each host, or sometimes for each virtual machine (VM), incurring unnecessary licensing and maintenance fees while also increasing the time spent managing and maintaining those additional licenses. Worse, without an agent installed on a host, that precious data could be missed on the next backup.
Most backup software, when protecting applications such as Exchange, SQL, and SharePoint, or performing other data protection tasks (such as tape backup or replication), will require additional plug-ins and/or licenses. Some traditional backup software now uses capacity-based licensing. This model may suit larger shops, but it can also cause costs to skyrocket quickly while bundling features and functions that may not be needed. Consider backup applications that take a simpler approach to licensing by eliminating agents, plug-ins, and charges for unnecessary features. Not only do these applications typically cost less, but their deployment methodology can reduce overhead and eliminate the need to keep track of licenses. Most importantly, they ensure that all of a customer’s data is captured – not just the data they are licensed to capture.
Backing up more data than necessary drives costs up, so adding VMs to a backup scheme can drive up costs exponentially. VMs generate a lot of extraneous data, so some hypervisors have introduced clever, unique ways to tackle this challenge. Look for VM specialty tools that not only take advantage of these hypervisor features but also help eliminate this extraneous data. Deduplication technologies can further reduce the overall footprint on premises, as well as the data sent to and residing in the cloud for DR and off-site backup.
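To see why deduplication pays off so well for VMs, consider that machines cloned from the same template share most of their disk blocks. The following is a minimal sketch of fixed-block deduplication; the block size, hashing scheme, and sample data are illustrative assumptions, not any vendor’s implementation:

```python
import hashlib

def dedupe_blocks(data: bytes, block_size: int = 4096):
    """Split a byte stream into fixed-size blocks and store each
    unique block only once, keyed by its SHA-256 digest."""
    store = {}   # digest -> block bytes (each unique block stored once)
    recipe = []  # ordered digests needed to rebuild the stream
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)
        recipe.append(digest)
    return store, recipe

def restore(store, recipe):
    """Rebuild the original stream from the block store and a recipe."""
    return b"".join(store[d] for d in recipe)

# Two hypothetical VMs cloned from the same template share most blocks,
# so the combined store is far smaller than the raw data.
vm1 = b"A" * 16384 + b"unique-to-vm1"
vm2 = b"A" * 16384 + b"unique-to-vm2"
store = {}
recipes = []
for vm in (vm1, vm2):
    s, r = dedupe_blocks(vm)
    store.update(s)
    recipes.append(r)
```

Here the two 16 KB-plus images reduce to just three unique blocks in the store, yet each VM can still be restored exactly from its recipe.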
Finally, additional costs associated with bringing virtual data into the mix can include more physical servers (or backup servers), more storage hardware for this data, and additional technical support fees from the publisher for servicing virtual data. On the backup server side, there are backup applications built specifically to run as virtual appliances, taking full advantage not only of the existing infrastructure but also of all the inherent benefits of being a virtual machine, such as vMotion and Storage vMotion.
Coping with Complexity
VMs are by nature dynamic structures, creating all kinds of havoc in the relatively controlled world of data protection. Many factors cause this complexity, particularly VM sprawl: VMs are so easy to deploy that even employees who don’t know or care about data protection can create them, causing IT managers unnecessary pain.
Some key issues to think about: How well does the backup application integrate with the hypervisor? Some applications work directly with the hypervisor to run an inventory and auto-discover any new VMs or hosts created since the previous backup. This ensures that data is protected no matter how those VMs were created. There are also serious performance issues to consider that can negatively impact an environment. Backup jobs run at scheduled times, regardless of what else may be happening in the environment, and they consume resources (network, disk I/O, compute power, etc.). Applications geared toward virtualized environments typically deploy as an OVF template (virtual appliance) and use fewer resources, and they can even be vMotioned away as their duties wind down, freeing resources the host needs for VMs performing more important functions.
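The auto-discovery idea above is essentially a set difference between the hypervisor’s inventory and the backup application’s protected list. A minimal sketch follows; the hard-coded `current_inventory` list is a stand-in for what a real hypervisor inventory API (e.g., the vSphere SDK) would return at backup time, and the VM names are invented for illustration:

```python
def discover_new_vms(current_inventory, protected_vms):
    """Return VMs present in the hypervisor inventory but absent
    from the backup application's protected set."""
    return sorted(set(current_inventory) - set(protected_vms))

# Stand-in for a live hypervisor inventory query at backup time.
current_inventory = ["web-01", "web-02", "db-01", "dev-scratch-07"]
protected_vms = ["web-01", "web-02", "db-01"]

new_vms = discover_new_vms(current_inventory, protected_vms)
for vm in new_vms:
    # A real tool would attach the VM to a default backup policy here.
    print(f"auto-adding unprotected VM: {vm}")
```

Run on every backup cycle, this catches the VM that a developer spun up yesterday without telling anyone, which is exactly the sprawl scenario described above.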
It’s also important to understand how easy it is to perform restores with an application. If a proprietary format is used (as it is by the majority of applications), then only that backup application has access to the data, dramatically reducing restore options and causing unnecessary resource contention. These applications must deploy a virtual helper to unpack the data so that the hypervisor or database can understand what it is receiving. Backup applications that protect data in its native format (explored later in this article) are more flexible by nature and eliminate the need to boot VMs before browsing, searching, or restoring individual files.
Capabilities – What’s Necessary?
Scalability is a major concern for any solution that protects virtual data. Virtualization isn’t slowing down, and chances are most environments already reap its benefits; they require a data protection solution that scales, not just from a support perspective, but on the fly as virtual data grows. VM sprawl and extraneous data raise the costs of a data protection scheme, but if that scheme can’t grow at the same pace as the VM data, what good is it?
IT managers need a system that scales easily so data never goes unprotected. Traditional applications were built around protecting physical data, so they may lack the features or the interoperability with virtual data – or with the hypervisors themselves – to get the job done. Direct ties to the hypervisor are critical for a virtual data protection solution. Some hypervisors have unique ways to reduce the extra data associated with their VMs; it is equally important that a data protection solution auto-discover any new VMs so no data goes unprotected. If a VM or its associated datastore moves, will the backup application still protect that data? A data protection system should protect not only virtual data but also physical data, which may not be growing but is certainly not going away. Pairing specialty applications for virtual data with traditional software for physical data is the best approach. Some specialty applications can even work in conjunction with, or present virtual data to, the traditional applications, so that existing investments in backup media, backup applications, policies, and procedures do not go to waste.
The final capability to consider: does a solution offer backup medium flexibility, such as being able to protect to tape and cloud, or whatever the next best thing will be? Data protection solutions that can use any backup medium, not only for backup but also for archiving VMs and their associated data, will have the most success. Make sure that any application or solution in place supports not only disk and cloud but also tape, which remains the preferred medium for backup and archive at almost 80% of organizations.
Considering Compatibility
Compatibility can mean any number of things. In this context we are talking specifically about the format in which the backup application “protects” data to its target destination. Most applications use a proprietary format that only they can read, virtually eliminating any flexibility in backup and DR choices. But it is more than that. Consider a backup application that protects the VM and its associated data in their native format, eliminating vendor lock-in and making the data a true asset, not only to the end user but also to the IT organization. As we think about ITaaS, data in a proprietary format limits its use – the backup application quickly becomes a boat anchor, since only that application can access the data. With applications that use the native format, virus scans, security and compliance checks, and indexing engines can all run directly against the data. More importantly, native-format applications preserve freedom of choice in backup medium and in whatever other applications may need the data 2, 5, or 10 years after it is first stored. Even if that choice is never exercised, it’s good to know it’s there.
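The practical payoff of a native format is that a backup is just ordinary files on disk, so any standard tool can work with it. The sketch below builds a searchable filename index over a simulated backup tree; the directory layout and VM names are invented for illustration, and a real indexer, scanner, or compliance tool would walk the tree the same way:

```python
import os
import tempfile

def index_backup(root):
    """Walk a native-format backup tree and map each filename to the
    paths where it appears -- no vendor tool needed to read the data."""
    index = {}
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            index.setdefault(name, []).append(os.path.join(dirpath, name))
    return index

# Simulate a native-format backup: plain files on disk, one folder per VM.
root = tempfile.mkdtemp(prefix="backup-")
for vm, files in {"web-01": ["web-01.vmdk", "web-01.vmx"],
                  "db-01": ["db-01.vmdk", "db-01.vmx"]}.items():
    vm_dir = os.path.join(root, vm)
    os.makedirs(vm_dir)
    for f in files:
        open(os.path.join(vm_dir, f), "w").close()

index = index_backup(root)
```

With a proprietary container, the same task would first require the vendor’s application to unpack the data, which is precisely the lock-in the paragraph above describes.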
As virtualization continues down the path of world dominance, customers will keep realizing the benefits it offers, but they will also need to learn how best to protect this new data structure. Using traditional applications to protect virtual data may be enough to get by today, but will it complement the virtual infrastructure or be at odds with it? Customers should understand how and why virtual data differs from physical data, but should not be afraid to look at the options available for protecting it. Some applications are built around traditional data protection methodologies and have been somewhat adapted to protect virtual data; others have been built around virtualization technologies, with respect for how these environments operate, and can take advantage of the functions and flexibility of virtual infrastructures. Each brings its strengths and weaknesses – the good news is there are options.
To learn more about Quantum’s line of DXi deduplication appliances and software, or to download a free version of vmPRO, check out http://www.quantum.com/DXi