Wednesday, June 18, 2014

System Center Data Protection Manager 2010 Hyper-V protection: Configuring cluster networks for CSV redirected access

When System Center Data Protection Manager 2010 (DPM) protects Hyper-V guests using the Microsoft Software Shadow Copy provider (the system VSS provider), it takes software snapshots when backing up guests located on Cluster Shared Volume (CSV) disks. The CSV remains in redirected access mode for the duration of the guest backup. This means all disk I/O for guests located on that CSV travels over the network instead of going directly to the CSV, and performance may be affected.

The following describes how to configure the cluster networks so that redirected I/O goes through a dedicated network for CSV traffic and not over the normal client access network or the cluster heartbeat network.
For example, suppose your cluster has the following three networks:
iSCSI Storage Network
Heartbeat Cluster
Host Access Cluster

The iSCSI storage network connects to the iSCSI SAN and is disabled for cluster use. This means that only two networks are available to the cluster. For a non-Hyper-V cluster this would be fine, but once you start clustering virtual machines you need at least one additional network. The reason is that when the CSV drives go into redirected mode, all disk I/O is sent over the network designated for CSV to the one node identified as the coordinator. With DPM doing the backups, the coordinator node would be the one that is hosting the VM during the backup. There is also a network designated as the live migration network, used when a guest is live migrated from one node to another. Both kinds of traffic can flood a network with I/O, so you want to keep them off the primary heartbeat network and off the network that clients use to access the VMs.
 Requirements for Using Cluster Shared Volumes in a Failover Cluster in Windows Server 2008 R2
 http://technet.microsoft.com/en-us/library/ff182358(WS.10).aspx

If you are using Cluster Shared Volumes, your network configuration must meet the following requirements:

Network adapters: We recommend that you install enough network adapters in each node so that one network is available for CSV while other networks are available for other purposes. For more information, see Understanding redirected I/O mode in CSV communication later in this topic. We recommend that you do not use the same network adapter for CSV communication as you use for virtual machine access and management.
Now that we know we have two networks available for cluster use, we can look at the metrics to see which one is the CSV network and which is the live migration network. You can check this with PowerShell:

PS C:\Windows\system32> Import-Module failoverclusters
PS C:\Windows\system32> Get-ClusterNetwork | FT Name, Metric, Role
Name                      Metric    Role
------------------------------------
iSCSI Storage Network     10100     0
Heartbeat Cluster         1000      1
Host Access Cluster       10000     3
This confirms that the iSCSI Storage Network is disabled for cluster use (Role=0), the Heartbeat Cluster network is set for cluster communications only (Role=1), and the Host Access Cluster network allows client access (Role=3). Any network that has a default gateway defined is considered a public network and is given a metric of 10,000 or more. Any network without a default gateway is considered a private interconnect and is given a metric between 1,000 and 10,000. The metric represents a "cost": the lower the number, the more likely the network will be used for private cluster communications, including redirected I/O for Cluster Shared Volumes.
The order of networks in the live migration setting is initially configured from the cluster network Metric property. To keep CSV and live migration from using the same network, the network with the lowest metric (highest priority) is automatically placed at the bottom of the live migration list and the network with the next lowest metric is placed at the top. Based on the output above, CSV traffic would go over the Heartbeat Cluster network and live migration would use the Host Access Cluster network. This is not optimal and may affect client access to the running guests.
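The selection rule described above can be sketched in PowerShell. This is only an illustration: Get-LowestMetricNetwork is a hypothetical helper, not a cmdlet in the FailoverClusters module, and it assumes that networks with Role=0 are never candidates because they are disabled for cluster use. The sample objects are typed in from the output shown earlier so the sketch runs without a cluster.

```powershell
# Hypothetical helper (not part of the FailoverClusters module): given objects
# with Name, Metric and Role, return the cluster-enabled network with the
# lowest Metric, which is the one expected to carry CSV redirected I/O.
function Get-LowestMetricNetwork {
    param([object[]]$Networks)
    $Networks |
        Where-Object { $_.Role -ne 0 } |   # assumption: Role 0 = disabled for cluster use
        Sort-Object Metric |
        Select-Object -First 1
}

# Sample data copied from the output above (PowerShell 2.0 compatible syntax).
$sample = @(
    New-Object PSObject -Property @{ Name = 'iSCSI Storage Network'; Metric = 10100; Role = 0 }
    New-Object PSObject -Property @{ Name = 'Heartbeat Cluster';     Metric = 1000;  Role = 1 }
    New-Object PSObject -Property @{ Name = 'Host Access Cluster';   Metric = 10000; Role = 3 }
)

(Get-LowestMetricNetwork -Networks $sample).Name   # Heartbeat Cluster

# On a cluster node, the same check against live data would be:
#   Import-Module FailoverClusters
#   (Get-LowestMetricNetwork -Networks (Get-ClusterNetwork)).Name
```

With the three networks in this example the helper returns the Heartbeat Cluster network, which matches the conclusion above that redirected I/O would share the heartbeat network.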
What you would need to do is add at least one additional network that can be used for the CSV redirected mode. Once you add the additional network, you should change the metrics so that you can configure which is used for CSV traffic. 
For example, say you added a new network with no default gateway and called it CSV-LM Cluster. The cluster automatically assigns it a metric. Let's say the output now looks like this:

PS C:\Windows\system32> Get-ClusterNetwork | FT Name, Metric, Role
Name                      Metric    Role
------------------------------------
iSCSI Storage Network     10100     0
Heartbeat Cluster         1000      1
Host Access Cluster       10000     3
CSV-LM Cluster            1100      1

For information on how to change the metric on the CSV-LM Cluster network, see the following article:
Designating a Preferred Network for Cluster Shared Volumes ...
http://technet.microsoft.com/en-us/library/ff182335(WS.10).aspx
 

EXAMPLES:
To change the metric on the CSV-LM Cluster network, run this PowerShell command:
PS C:\Windows\system32> Get-ClusterNetwork "CSV-LM Cluster" | %{$_.Metric=800}

If you added two networks (one for CSV and one for live migration), you would run these two commands:

PS C:\Windows\system32> Get-ClusterNetwork "CSV Cluster" | %{$_.Metric=800}
PS C:\Windows\system32> Get-ClusterNetwork "LM Cluster" | %{$_.Metric=900}

Set the metric manually to a value below 1000 so that it cannot conflict with the automatically assigned metric of any network that is added later. Now when you list the networks again, you see this:

PS C:\Windows\system32> Get-ClusterNetwork | FT Name, Metric, Role
Name                      Metric    Role
------------------------------------
iSCSI Storage Network     10100     0
Heartbeat Cluster         1000      1
Host Access Cluster       10000     3
CSV Cluster               800       1
LM Cluster                900       1

Based on the metric, CSV traffic will now use the CSV Cluster network because it has the lowest metric.
The other change to make is the network used for live migration. To do this, open Failover Cluster Manager. Highlight any guest, then in the summary section right-click the guest and select the properties of that VM. It does not matter which guest you pick, because this is a single global setting. In the properties, go to the Live Migration Network tab. Uncheck the Host Access Cluster network and select the CSV-LM Cluster network (if that was the only network added) or the LM Cluster network (if you added two networks). This can all be done on the fly and takes effect immediately. No reboots or restarts are necessary.
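As a quick check, the AutoMetric property of each cluster network indicates whether the metric is still assigned automatically. The following is a minimal sketch to run on a cluster node; the network name "CSV Cluster" is the example name used in this article, and the commented-out line is only needed if you want to undo a manual metric.

```powershell
Import-Module FailoverClusters

# AutoMetric should show False for networks whose Metric was set by hand.
Get-ClusterNetwork | Format-Table Name, Metric, AutoMetric, Role -AutoSize

# To hand a network back to automatic metric assignment (example name):
# (Get-ClusterNetwork "CSV Cluster").AutoMetric = $true
```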

Once this is done, test the configuration by placing all CSV disks into redirected mode to see how the network handles the redirected I/O traffic. You do this in Failover Cluster Manager on Windows Server 2008 R2. Under the Cluster Shared Volumes node, on the summary page, right-click each cluster disk, select More Actions, then "Turn on redirected access for this cluster shared volume". You can monitor I/O in Task Manager using the Performance and Networking tabs.
Keep in mind that while the drives are in redirected mode the VMs may feel a little sluggish to clients, but not to the extent that a node is removed from the cluster due to loss of network connectivity, or that clients cannot connect to the running guests. Once you confirm that redirected mode works, turn off redirected mode for all CSV disks, start taking DPM backups, and see whether you run into any issues.
Before backups can be taken using the system VSS provider, you must also enable serialization as described in the following article.
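If you prefer to watch the test from a PowerShell prompt instead of Task Manager, something along these lines may help. It is a sketch under assumptions: it assumes the objects returned by Get-ClusterSharedVolume expose a SharedVolumeInfo property with RedirectedAccess and MaintenanceMode fields on your build, and the sample interval and count are arbitrary values chosen for illustration.

```powershell
Import-Module FailoverClusters

# Show whether each CSV is currently in redirected access.
# Assumption: SharedVolumeInfo exposes these fields on your system.
Get-ClusterSharedVolume |
    Select-Object -ExpandProperty SharedVolumeInfo |
    Format-Table FriendlyVolumeName, RedirectedAccess, MaintenanceMode -AutoSize

# Sample total throughput per network adapter: 5 samples, 2 seconds apart.
Get-Counter -Counter '\Network Interface(*)\Bytes Total/sec' -SampleInterval 2 -MaxSamples 5
```

While the volumes are redirected, the adapter on the CSV network should show most of the additional traffic. If the heartbeat or client access adapters climb instead, recheck the metrics.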
Considerations for Backing Up Virtual Machines on CSV with the System VSS Provider
http://technet.microsoft.com/en-us/library/ff634192.aspx

More Information

You can also configure DPM 2010 to use a dedicated backup network to isolate backup data from going over the client access network.  For more information see the following:
Using Backup Network Address : http://technet.microsoft.com/en-us/library/cc964298.aspx
From the AskCore blog: http://blogs.technet.com/askcore/archive/2009/03/26/so-you-want-to-try-a-backup-network.aspx
The following links contain useful information about DPM 2010 protection of Hyper-V Guests.
Managing Hyper-V Computers
http://technet.microsoft.com/en-us/library/ff399446.aspx
Understanding Protection for CSV
http://technet.microsoft.com/en-us/library/ff634189.aspx
The following links contain useful information on Cluster Shared Volumes and network considerations.
Requirements for Using Cluster Shared Volumes in a Failover Cluster in Windows Server 2008 R2
http://technet.microsoft.com/en-us/library/ff182358(WS.10).aspx
 Hyper-V: Using Live Migration with Cluster Shared Volumes in Windows Server 2008 R2
http://technet.microsoft.com/en-us/library/dd446679(WS.10).aspx
