Exchange Architecture:
EXCHANGECASHUB1: Roles: CAS/HUB
EXCHANGECASHUB2: Roles: CAS/HUB
EXCHANGEMAILBOX1: Roles: Mailbox
EXCHANGEMAILBOX2: Roles: Mailbox
All 4 servers are Exchange 2010 SP2 Enterprise running on Windows Server 2008 R2 Enterprise.
All 4 servers are VMs on VMware vSphere 5.0.
I have 2 network interfaces on each Mailbox server: the first for MAPI and the second for replication.
Both Mailbox Servers are part of a DAG. All mailbox databases are active on one Mailbox server at a time and are replicated to the other mailbox server via the DAG.
Both Mailbox servers use two VMXNET3 NICs, and all of the mailbox database disks use the VMware Paravirtual SCSI adapter. The OS disk uses the LSI Logic SAS SCSI adapter.
Issue:
Whenever I upgrade VMware Tools on one of my Exchange Mailbox servers, all active mailbox databases dismount. To recover, both nodes need to be rebooted at least once. After the reboots, the databases mount again on the active server and we are back to normal.
Prior to upgrading VMware Tools (or performing any maintenance), I suspend the database copies to prepare for maintenance. Additionally, I run Set-MailboxServer -DatabaseCopyAutoActivationPolicy:Blocked. If I'm preparing to do maintenance on the active cluster node, I also run cluster group "cluster group" /move
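For reference, the preparation steps above look roughly like this in the Exchange Management Shell. This is a sketch of the sequence, not a transcript of the exact commands: EXCHANGEMAILBOX2 is assumed to be the passive node about to be patched, and the suspend comment text is just an example.

```powershell
# Sketch of the pre-maintenance steps, run from the Exchange Management Shell.
# Assumption: EXCHANGEMAILBOX2 is the passive node about to be patched.
$node = "EXCHANGEMAILBOX2"

# Suspend every database copy hosted on the node. The copies on this node
# are expected to be passive; an active copy cannot be suspended.
Get-MailboxDatabaseCopyStatus -Server $node | ForEach-Object {
    Suspend-MailboxDatabaseCopy -Identity $_.Name `
        -SuspendComment "VMware Tools upgrade" -Confirm:$false
}

# Keep the node from automatically activating any of its database copies.
Set-MailboxServer -Identity $node -DatabaseCopyAutoActivationPolicy Blocked

# Only when the node being patched currently owns the core cluster group:
# move that group to another node.
cluster.exe group "Cluster Group" /move
```

After maintenance the steps are reversed with Resume-MailboxDatabaseCopy and by setting DatabaseCopyAutoActivationPolicy back to Unrestricted.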
This problem occurs on either DAG node when I upgrade VMware Tools and can be reproduced every time. In an attempt to resolve this, I applied the hotfix from http://support.microsoft.com/kb/2550886, but it did not resolve the issue. Other than this issue, the Exchange DAG operates beautifully. I can suspend the database copies, block database copy auto activation, take the server offline, and patch or do whatever I need to do, and the DAG member hosting the active databases works flawlessly. But if I upgrade VMware Tools on a passive DAG member, all active mailbox databases dismount.
Additionally, during the first reboot of the passive node right after VMware Tools is upgraded, I can see Windows Server 2008 R2 hanging while shutting down the Cluster service. At that point I know that all of my active databases have just dismounted and my Exchange environment is on vacation. :( Also, when the passive server comes back up (before the active DAG member has been restarted), the Cluster service will not start on the passive DAG member. I'm assuming this is also the case on the supposedly active DAG member, but I haven't verified it. Again, the resolution is to restart both servers at least once, and then everything returns to normal.
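The state described above, including whether the Cluster service is also stopped on the active member, could be checked with something like the following from the Exchange Management Shell. This is only a sketch: it assumes EXCHANGEMAILBOX1 is the member hosting the active databases, and the selected columns are just the ones that seem most relevant here.

```powershell
# Is the Cluster service (ClusSvc) running on each DAG member?
Get-Service -Name ClusSvc -ComputerName EXCHANGEMAILBOX1, EXCHANGEMAILBOX2 |
    Select-Object MachineName, Status

# Which databases are mounted, and on which server?
Get-MailboxDatabase -Status | Select-Object Name, Server, Mounted

# Copy status as seen by the member assumed to host the active databases.
Get-MailboxDatabaseCopyStatus -Server EXCHANGEMAILBOX1 |
    Select-Object Name, Status, CopyQueueLength, ReplayQueueLength
```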
Anyone see this before?