One of my customers wants to know how to leverage Exchange 2010 to provide high availability (surviving a server failure) and disaster recovery (surviving a site failure) using the minimum number of servers. Here is a walk-through of the reference design and the server fail-over experience. The design uses the following machines:
- DC (FSW)
- Hardware Load Balancer (VIP for CAS Array)
- EX2010-1 (CAS/HTS/MBX Roles)
- EX2010-2 (CAS/HTS/MBX Roles)
- DC-DR (Alternate FSW)
- EX2010-3 (CAS/HTS/MBX Roles)
Configuring High Availability with two Exchange 2010 Servers
I am going to assume that you are already familiar with the process of installing Exchange, creating a DAG, and creating a CAS Array - so here is an overview of the configuration:
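I added all three servers to my DAG and set the Domain Controller as the File Share Witness (note: since there are three servers in my DAG, an odd number, it will use the Node Majority quorum model under normal circumstances, so the witness is only needed when the DAG has an even number of members).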
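As a rough sketch, the DAG setup described above could look like this in the Exchange Management Shell. The DAG name `DAG1` and the witness directory paths are placeholders I made up for illustration; the server names come from the list at the top of this post.

```powershell
# Assumption: "DAG1" and the C:\DAG1 paths are illustrative names, not taken from the lab.
# Create the DAG with the production Domain Controller as the File Share Witness.
New-DatabaseAvailabilityGroup -Name DAG1 -WitnessServer DC -WitnessDirectory C:\DAG1

# Add the two production servers and the DR server as DAG members.
Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer EX2010-1
Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer EX2010-2
Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer EX2010-3

# Pre-stage the alternate witness in the DR site so it is ready for a site fail-over.
Set-DatabaseAvailabilityGroup -Identity DAG1 -AlternateWitnessServer DC-DR -AlternateWitnessDirectory C:\DAG1
```

Depending on your network you may also need to give the DAG a static IP address; I have left that out here.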
Next I configured my database to replicate to all of the members of my DAG.
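For reference, adding the copies might look something like the following. DB1 is the database used later in this post; the activation preference values are my assumption (production copy first, DR copy last).

```powershell
# DB1 already exists on EX2010-1; add passive copies on the other two DAG members.
# Assumption: the ActivationPreference values are illustrative.
Add-MailboxDatabaseCopy -Identity DB1 -MailboxServer EX2010-2 -ActivationPreference 2
Add-MailboxDatabaseCopy -Identity DB1 -MailboxServer EX2010-3 -ActivationPreference 3

# Check that the new copies have seeded and report a Healthy status.
Get-MailboxDatabaseCopyStatus -Identity DB1
```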
Next I created a Client Access Array in the Exchange Management Shell and assigned it to my database.
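A minimal sketch of that step is below. The array name, the AD site name, and the choice of internal.test.local as the array FQDN are assumptions for illustration; substitute your own values.

```powershell
# Assumption: "CASArray" and "Default-First-Site-Name" are placeholder names.
# Assumption: the array FQDN is the internal name that resolves to the load balancer VIP.
New-ClientAccessArray -Name "CASArray" -Fqdn "internal.test.local" -Site "Default-First-Site-Name"

# Point the database at the array so Outlook connects to the array name
# instead of an individual Client Access server.
Set-MailboxDatabase -Identity DB1 -RpcClientAccessServer "internal.test.local"
```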
Next I created a VIP on my hardware load balancer. I used a Barracuda 340 - but really any HLB should be fine.
Next, I created DNS records that point to the VIP on my hardware load balancer. I used two names: internal.test.local and external.test.local.
Finally I configured the InternalURL and ExternalURL on my Exchange Virtual Directories to point to my VIP.
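As an example, here is roughly how the URLs could be set for two of the virtual directories (EWS and OWA) on the production servers. I am assuming HTTPS and the default web site names; the other virtual directories (ECP, ActiveSync, OAB) follow the same pattern with their own Set- cmdlets.

```powershell
# Assumption: HTTPS and the default "(Default Web Site)" virtual directory names.
foreach ($server in "EX2010-1", "EX2010-2") {
    # Exchange Web Services
    Set-WebServicesVirtualDirectory -Identity "$server\EWS (Default Web Site)" `
        -InternalUrl "https://internal.test.local/EWS/Exchange.asmx" `
        -ExternalUrl "https://external.test.local/EWS/Exchange.asmx"

    # Outlook Web App
    Set-OwaVirtualDirectory -Identity "$server\owa (Default Web Site)" `
        -InternalUrl "https://internal.test.local/owa" `
        -ExternalUrl "https://external.test.local/owa"
}
```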
What happens during a Server Failure
I now have high availability within my production site that can tolerate the failure of either EX2010-1 or EX2010-2.
DB1 is currently mounted on EX2010-1. When I look at my Connection Status in Outlook, it shows that I am connected to the VIP (in this instance, I am actually connected to EX2010-1 via the load balancer).
If I do a graceful fail-over of my database to EX2010-2, my Outlook clients will receive a notification that they need to restart Outlook. Note that even after the fail-over I am still using EX2010-1 as my RPC Client Access Server via my hardware load balancer.
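For reference, a graceful database fail-over (a switchover) like the one described here can be started from the Exchange Management Shell along these lines:

```powershell
# Activate the copy of DB1 on EX2010-2; the copy on EX2010-1 becomes passive.
Move-ActiveMailboxDatabase -Identity DB1 -ActivateOnServer EX2010-2

# Confirm which server now holds the mounted copy.
Get-MailboxDatabaseCopyStatus -Identity DB1
```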
If I fail over my RPC Client Access Server from EX2010-1 to EX2010-2 (by marking EX2010-1 down on my hardware load balancer), my Outlook client will briefly lose its connection before it successfully reconnects.
In the event of a non-graceful server failure, my Outlook client would briefly lose its connection before reconnecting (and possibly prompting me to restart Outlook).
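After an unplanned failure, a quick check like the following can show where DB1 ended up and whether the surviving server is healthy. This is only a sketch, and it assumes EX2010-1 was the server that failed.

```powershell
# Show the status of every copy of DB1 (one copy should report Mounted).
Get-MailboxDatabaseCopyStatus -Identity DB1

# Assumption: EX2010-2 is the surviving production server.
# Run the built-in replication health checks against it.
Test-ReplicationHealth -Identity EX2010-2
```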