Exchange 2010 SP2 mailbox server not responding after 20 RPCs not processed for 1 min

Hello,

I am in the process of researching an Exchange outage that occurred.

We are using 2 CAS/HT servers and 2 other servers installed as a 2-node DAG. All are running Exchange 2010 SP2 RU3.

All of a sudden, no users could access their mailboxes. The CAS servers were rebooted and all was OK again.

After researching, I found an error on the MBX DAG server:

Event ID 10025, MSExchangeIS:

There are 20 RPC requests that have taken an abnormally long time to complete. This may be indicative of performance issues with your server.

I have found some info:

http://blogs.technet.com/b/exchange/archive/2011/03/21/information-store-timeout-detection-in-exchange-2007-sp3.aspx

and

http://technet.microsoft.com/en-us/library/ff477616

Also this one is helpful: http://johanveldhuis.nl/?tag=exchange-2010&lang=en

So here's what I think happened:

At some moment there were 20 RPCs not making progress. This caused the affected mailbox server to stop responding.

Rebooting the CAS servers solved the issue, because that stopped the RPCs.

Is my theory correct? The links only mention a mailbox quarantine when a threshold of 5 is reached for a particular mailbox. In that case the mailbox of the offending user is quarantined, meaning it is not accessible.

They don't explicitly mention what happens when the threshold of 20 is reached for a server (this happened to my DAG member).
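
To make my reading of the two thresholds concrete, here is a rough sketch in Python. It is only my mental model, not how the Information Store is actually implemented: the names (`Rpc`, `evaluate`) are made up, the 60-second window and the values 5 and 20 come from the event text and the links above, and whether the comparison is "at least" or "more than" is my assumption.

```python
from dataclasses import dataclass

# Assumed values, taken from the event text and linked articles,
# not from any Exchange API or source code.
STALL_SECONDS = 60       # an RPC counts as stalled after 1 minute without progress
MAILBOX_THRESHOLD = 5    # stalled RPCs on one mailbox before it is quarantined
SERVER_THRESHOLD = 20    # stalled RPCs on the server before event 10025 is logged


@dataclass
class Rpc:
    """Hypothetical snapshot of one in-flight RPC."""
    mailbox: str
    seconds_without_progress: float


def evaluate(rpcs):
    """Return (mailboxes_to_quarantine, server_flagged) for a snapshot of RPCs."""
    stalled = [r for r in rpcs if r.seconds_without_progress >= STALL_SECONDS]

    # Count stalled RPCs per mailbox.
    per_mailbox = {}
    for r in stalled:
        per_mailbox[r.mailbox] = per_mailbox.get(r.mailbox, 0) + 1

    quarantine = sorted(m for m, n in per_mailbox.items() if n >= MAILBOX_THRESHOLD)
    server_flagged = len(stalled) >= SERVER_THRESHOLD
    return quarantine, server_flagged


# 20 stalled RPCs spread over 10 mailboxes (2 each): no single mailbox
# reaches 5, yet the server-wide count reaches 20.
spread = [Rpc(f"user{i % 10}", 90.0) for i in range(20)]
quarantine, server_flagged = evaluate(spread)
```

In this model the server threshold can be reached without any single mailbox reaching its own threshold, which would match what I saw: event 10025 on the server, but no quarantined mailbox that I know of. What the server does at that point is exactly the part I cannot find documented.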

Anyone?


