I'm running Exchange 2010 SP1 RU6. I've noticed that the replay queue climbs into the thousands for a few of my passive database copies. Yesterday MDB3 was over 5,000 by the end of the day. Overnight, during non-production hours, the replay queue drops back down to 0, and every morning at 7-8am the process starts over again.
Quick idea of the environment: the Exchange servers replicating DBs are EX1, EX2, EX3, EX4, and EX5. EX1 and EX2 are old IBM servers with internal hardware storage plus 1-2 iSCSI LUNs. EX3, EX4, and EX5 are HS22 blades in an IBM S chassis, and all storage for those blades is directly connected SAS. These last three servers (EX3, EX4, EX5) are where the replay queue lengths are getting out of control.
Before anyone asks: I have unchecked the option to mount the DBs at startup.
I have been looking through this blog post to see whether this is some kind of storage problem, but I'm not really coming up with anything other than some matching symptoms:
http://blogs.technet.com/b/mikelag/archive/2011/02/09/how-fragmentation-on-incorrectly-formatted-ntfs-volumes-affects-exchange.aspx
When I searched the forums here for solutions, I found one thread, but it never received any further updates on the issue:
http://social.technet.microsoft.com/Forums/en-US/exchange2010/thread/6f8c06a9-958e-4e89-911c-f5ce1db81ef8
I'm concerned about failover times, and about potential data loss, with the replay queue this high.
Here's what Get-MailboxDatabaseCopyStatus tells me:
[PS] C:\Windows\system32>Get-MailboxDatabaseCopyStatus
Creating a new session for implicit remoting of "Get-MailboxDatabaseCopyStatus" command...

Name                                Status   CopyQueueLength  ReplayQueueLength  LastInspectedLogTime    ContentIndexState
----                                ------   ---------------  -----------------  --------------------    -----------------
MDB2\EX5                            Healthy  0                681                12/12/2012 10:06:09 AM  Healthy
MDB4\EX5                            Healthy  0                527                12/12/2012 10:06:08 AM  Healthy
MDB1\EX5                            Healthy  0                344                12/12/2012 10:06:11 AM  Healthy
Service\EX5                         Healthy  0                0                  12/12/2012 8:00:56 AM   Healthy
MDB3\EX5                            Healthy  0                870                12/12/2012 10:06:08 AM  Healthy
Mailbox Maintenance 0462345754\EX5  Mounted  0                0                                          Healthy
PowerUsers\EX5                      Healthy  0                2                  12/12/2012 10:06:08 AM  Healthy
The queue on MDB3 is at about 1k now and it's only 10am, which is just 2 hours of production time. My theory is that this happens because the DBs are simply large, but I don't know that for sure, which is why I am posting here. The PowerUsers, Service, and Maintenance DBs are not having this issue (they are much smaller than the others).
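If a trend over the day would be more useful than a single snapshot, something along these lines could log the queue lengths at intervals. This is only a sketch: the 5-minute interval, the sample count, and the CSV file name are arbitrary choices for illustration, and EX5 is just the server from the output above. It relies only on the -Server parameter of Get-MailboxDatabaseCopyStatus and standard PowerShell cmdlets.

```powershell
# Sample copy/replay queue lengths for every database copy on one server.
# Assumptions (not from my environment): interval, sample count, output path.
$server   = "EX5"
$interval = 300   # seconds between samples
$samples  = 12    # 12 samples x 5 minutes = about 1 hour

1..$samples | ForEach-Object {
    # Timestamp taken once per sample so all rows of a sample share it
    $now = Get-Date

    Get-MailboxDatabaseCopyStatus -Server $server |
        Select-Object @{Name = "Time"; Expression = { $now }},
                      Name, Status, CopyQueueLength, ReplayQueueLength

    # Wait before taking the next sample
    Start-Sleep -Seconds $interval
} | Export-Csv -Path ".\ReplayQueue-$($server).csv" -NoTypeInformation
```

Comparing the ReplayQueueLength column across samples for MDB3 against a small database like PowerUsers should show whether the backlog grows steadily through production hours or jumps at particular times.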
Any advice would be appreciated. Thanks.