Cleaning up SMVI snapshots in vSphere

For whatever reason, VM snapshots sometimes get stuck and are ultimately forgotten. As you know, this can be disastrous: once that line in the sand has been drawn, the resulting redo log will continue to grow until its underlying disk is completely exhausted. One-off manual snaps are easily forgotten, but it is worse when programmatic snaps, such as those taken by NetApp SMVI or VCB, don’t get removed cleanly and start to stack up. I just had this problem when my SMVI, for reasons unknown, stopped removing snaps from one of my volumes and started accumulating them. By the time I caught the problem, some of the VMs on this volume had as many as 5 SMVI snapshots! Not good. SMVI is a great solution overall that works really well, but its handling and reporting of VI snapshots could be a lot better.



So now I have exposed a problem in my environment. The performance of my VMs is suffering and the integrity of my backups could be questionable. The first step is to create a new alarm in vCenter that watches VM snapshot size. I will warn on anything over 1GB and alert at 2GB. While researching solutions to this problem (and manually deleting snapshots) I came across a tool written by one of NetApp’s Architects last year. Even though it was written by NetApp specifically for SMVI, it is solid enough to be used in a number of other scenarios. CVMS (Cleanup VMware Snapshots) is an executable that can be run manually via CLI or called via a script. In my case I will call this tool via a script in SMVI.
Feed it the vCenter address, credentials, and snap name prefix, and then scope the cleanup by datastore, VM, or VM set. It will go in order and remove all snaps that match the defined prefix, one at a time. Manually created user snaps will not be affected, unless of course the snap name matches your defined prefix. The tool is incredibly well written and provides just about every customization you would care to make in this scenario.
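If you would rather assemble the CVMS command line from a script than type it by hand, something like the following Python sketch could work. It is not part of CVMS; the helper name build_cvms_args is made up for illustration, and it only uses the switches that appear in this post (-vcip, -vcuser, -vcpasswd, -ds, -snapname, -reportdir, -verbose). The switches for scoping by VM or VM set are not shown here, so they are left out.

```python
from typing import List, Optional


def build_cvms_args(vcip: str, vcuser: str, vcpasswd: str,
                    snapname: str, datastore: str,
                    reportdir: Optional[str] = None,
                    verbose: bool = False,
                    exe: str = "cvms.exe") -> List[str]:
    """Assemble a cvms command line as an argument list.

    build_cvms_args is a hypothetical helper, not something shipped
    with CVMS. Only switches shown in this post are used.
    """
    if not snapname:
        # Refuse to build a cleanup command without a snapshot name filter.
        raise ValueError("snapname must not be empty")
    args = [exe,
            "-vcip", vcip,
            "-vcuser", vcuser,
            "-vcpasswd", vcpasswd,
            "-ds", datastore,
            "-snapname", snapname]
    if reportdir:
        args += ["-reportdir", reportdir]
    if verbose:
        args.append("-verbose")
    return args


if __name__ == "__main__":
    # Same values as the sample command in this post.
    print(build_cvms_args("10.2.1.2", "administrator", "passwd",
                          "bar", "test_ds", verbose=True))
```

The list can be handed straight to subprocess.run(); passing a list rather than one long string avoids quoting trouble with paths that contain spaces, such as a report directory under Program Files (x86).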


Sample command string and output:
C:\>cvms -vcuser administrator -vcpasswd passwd -vcip 10.2.1.2 -snapname bar -ds test_ds -verbose
LOG REPORT FOR CVMS
-----------------------------------------------------
CVMS Version: 1.0
Log Filename: \NetApp\CVMS\Report\CVMS_20100131_143350.log
Start Time: Sun Jan 31 14:33:50 2010
Datastore(s) selected: test_ds
Command line arguments successful.
Initializing connectivity to Virtual Center and storage appliances.
Converting Virtual Center hostname to IP address ...
Attempting to ping Virtual Center 10.2.1.2 ...
Ping of Virtual Center 10.2.1.2 successful.
Creating new Virtual Center instance for 10.2.1.2 ...
Logging into Virtual Center server 10.2.1.2 ...
Virtual Center login successful.
Collecting VMware and storage appliance configuration data.
Collecting datacenter information ...
Found 2 Datacenter(s).
Collecting host system information ...
Host system information collected.
Looking on host system esx2.internal.net for datastore test_ds ...
Requested Datastore (test_ds) is available.
Saving virtual machine information for vm2.
Saving virtual machine information for vm1.
Cleaning up snapshots for all VMs listed ...
Checking snapshot capability of VM vm1 ...
Removing all snapshots with string 'bar' from VM vm1 ...
No VM snapshots found.
Checking snapshot capability of VM vm2 ...
Removing all snapshots with string 'bar' from VM vm2 ...
Removing VM snapshot 'bar2' ...
Removal of VM snapshot for vm2 successful.
Command completed successfully.
Backup End Time: Sun Jan 31 14:34:02 2010
Exiting with return code: 0
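If you want to check these reports automatically rather than read them, the output is regular enough to parse. Here is a minimal Python sketch that pulls out which snapshots were removed per VM and the final return code. The line formats are taken only from the sample output above, so treat the patterns as assumptions that may not hold for other CVMS versions or for error cases; summarize_cvms_log is a made-up name.

```python
import re
from typing import Dict, List, Optional

# Patterns inferred from the single sample report in this post.
VM_RE = re.compile(r"^Checking snapshot capability of VM (?P<vm>\S+) \.\.\.$")
REMOVED_RE = re.compile(r"^Removing VM snapshot '(?P<name>[^']+)' \.\.\.$")
RC_RE = re.compile(r"^Exiting with return code: (?P<rc>-?\d+)$")


def summarize_cvms_log(text: str) -> Dict[str, object]:
    """Return removed snapshot names per VM and the reported return code."""
    removed: Dict[str, List[str]] = {}
    current_vm: Optional[str] = None
    return_code: Optional[int] = None
    for raw in text.splitlines():
        line = raw.strip()
        m = VM_RE.match(line)
        if m:
            # Every VM that CVMS checks gets an entry, even if nothing is removed.
            current_vm = m.group("vm")
            removed.setdefault(current_vm, [])
            continue
        m = REMOVED_RE.match(line)
        if m and current_vm is not None:
            removed[current_vm].append(m.group("name"))
            continue
        m = RC_RE.match(line)
        if m:
            return_code = int(m.group("rc"))
    return {"removed": removed, "return_code": return_code}
```

Fed the sample report above, this gives an empty list for vm1, ['bar2'] for vm2, and a return code of 0. A return code of None would mean the report was cut off before the final line.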
In my particular scenario, I back up per volume in SMVI, so I will add a custom script to each backup job to ensure that all snaps get properly cleaned up afterwards. To present a script to SMVI, it needs to exist in %PROGRAMFILES%\NetApp\SMVI\server\scripts (or under <drive>:\Program Files (x86)). SMVI can use .bat, .cmd, .pl, etc. Here is the syntax of one of my volume clean scripts:
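rem Run the cleanup only in the post-backup phase.
if not "%BACKUP_PHASE%"=="POST_BACKUP" goto end
rem Put the CVMS folder on the PATH (no quotes or stray spaces inside the value).
set "PATH=D:\Program Files (x86)\NetApp\CVMS;%PATH%"
cvms.exe -vcip vcenter -vcuser domain\account -vcpasswd password -ds Volume1 -snapname smvi -reportdir "D:\Program Files (x86)\NetApp\CVMS"
:end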
Depending on how your backups are configured, you could run a script like this at the end of the day or, like me, after every backup. I back up every 8 hours, so there is plenty of time for cleanup in between. The report directory will house a text file for each run, containing the same output you would see in the CLI using the -verbose switch. Refer to the SMVI 2.0 best practices guide for the variables that can be referenced in a script.
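With a cleanup running after every backup, the report directory fills up quickly, so it helps to be able to pick out the most recent report. A minimal Python sketch follows. It assumes reports are named like the one in the sample output (CVMS_20100131_143350.log, that is CVMS_YYYYMMDD_HHMMSS.log); that pattern is inferred from a single filename, and the function names are made up for illustration.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional

# Naming pattern inferred from the one sample filename in this post.
NAME_RE = re.compile(r"^CVMS_(\d{8})_(\d{6})\.log$")


def report_timestamp(filename: str) -> Optional[datetime]:
    """Parse the timestamp embedded in a report filename, or return None."""
    m = NAME_RE.match(filename)
    if not m:
        return None
    try:
        return datetime.strptime(m.group(1) + m.group(2), "%Y%m%d%H%M%S")
    except ValueError:
        # Right shape but not a real date or time, e.g. month 13.
        return None


def latest_report(report_dir: str) -> Optional[Path]:
    """Return the newest CVMS report in report_dir, judged by filename."""
    candidates = []
    for path in Path(report_dir).iterdir():
        stamp = report_timestamp(path.name)
        if stamp is not None and path.is_file():
            candidates.append((stamp, path))
    if not candidates:
        return None
    # Compare on the timestamp only, so ties never compare Path objects.
    return max(candidates, key=lambda item: item[0])[1]
```

Sorting by the timestamp in the filename rather than by file modification time keeps the result stable if the reports are later copied or archived elsewhere.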
The NA community homepage for the tool is in the references below. I checked, and CVMS is not in the NA Tool Chest, so unless someone tells me otherwise I will host a mirror of the utility.

References:
CVMS (homepage)
CVMS (mirror)
Scripting SMVI cleanup
SMVI 2.0 Best Practices Guide

2 comments:

  1. Hello, I tried the CVMS tool but it finds no VM snapshot. If I browse the datastore, though, I can see the snapshot. Any idea why it's not working?

  2. Same issue here. VSC had obviously created snapshots on VMs (-000001.vmdk, -000002.vmdk files in the datastore) that somehow failed to register completely with ESX and did not show up in Snapshot Manager, even though writes had started being directed to them.
    Right-click -> Snapshot -> Consolidate still managed to get rid of them.

