Are you Prepared for Disaster? Business Continuity Planning for Email Outages
It happens to everyone who uses any email service: suddenly your email is no longer working. If it’s just for a few minutes or some scheduled time at night, it’s usually no big deal. However, if it’s in the middle of your work day and you rely on email, you may have a big problem.
What do you do if your email stays offline for 5 minutes … 10 minutes … an hour … and you don’t know when it is coming back?
But that won’t happen!
True, there is no guarantee that such an issue will happen to you, but it certainly can, and it happens all the time to companies large and small, with in-house and outsourced services of all kinds.
There is a lot that can be done to prevent predictable problems like hardware failure (e.g., redundancy and load balancing). However, there is little that can be done to prevent human error. Yes, policies can be put in place and staff can be trained. But it only takes one accidental misstep at the wrong time to cause downtime immediately … or to create a time bomb that surfaces unexpectedly in the future.
Additionally, external factors could affect your ability to use email. E.g.,
- Network or ISP failure causing temporary connectivity problems
- DNS or registry failure that prevents inbound email from being delivered or prevents you from reaching your email provider’s servers
- Denial of Service attacks on your email provider, your DNS, their DNS, your ISP, some network in between, etc.
- Malicious staff with administrative permissions shutting down accounts or deleting things
- External email filtering services having issues and causing email delivery problems
So, the possibility of downtime exists no matter what. The questions are:
- How will the different kinds of downtime impact your business?
- How likely is each kind of downtime to occur?
- What can you do now so that should any of these issues arise, you can still run your business until the issue is resolved?
- Can this be done in a cost-effective manner?
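One rough way to work through these questions is to put a number on each scenario and rank them. The sketch below multiplies a guessed yearly likelihood by a guessed outage length and a guessed hourly cost. The class name, the function name, and every figure in it are placeholders for illustration, not data from any real business.

```python
from dataclasses import dataclass


@dataclass
class OutageScenario:
    name: str
    yearly_likelihood: float  # your own rough guess, 0.0 to 1.0
    hours_down: float         # guessed length of one outage
    cost_per_hour: float      # guessed cost of one hour without email

    def expected_yearly_cost(self) -> float:
        # Likelihood x impact: a simple expected-value estimate.
        return self.yearly_likelihood * self.hours_down * self.cost_per_hour


def rank_scenarios(scenarios):
    """Return scenarios sorted from most to least costly per year."""
    return sorted(scenarios, key=lambda s: s.expected_yearly_cost(), reverse=True)


# Placeholder numbers only; substitute your own estimates.
scenarios = [
    OutageScenario("Single server hardware failure", 0.25, 4, 500),
    OutageScenario("DNS provider outage", 0.10, 2, 500),
    OutageScenario("Malicious admin deletes accounts", 0.01, 48, 500),
]

for s in rank_scenarios(scenarios):
    print(f"{s.name}: {s.expected_yearly_cost():.2f}")
```

Comparing the expected yearly cost of a scenario with the yearly price of the service that mitigates it gives a first, admittedly crude, answer to the cost question.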
Below we present a series of options for addressing many of these possible issues. There may be other solutions, as well as other issues that could arise; however, we will try to address some of the most important and cost-effective solutions.
1. External Email Archival
Email archival solutions save copies of all inbound and outbound email:
- To an externally hosted service (not in your office or on your regular email servers)
- Where the email cannot be deleted or edited or lost
- Where the email is kept for a long period of time (e.g., 10 years or “forever”)
- Where you and/or your users can log in, search for, and download copies of email messages sent and received any time the need arises.
There are a few key points to be aware of:
- The archival system and access to it should not be at your office or in the same data center that holds your regular email, so that if either is down, you can still access your email archives.
- Email should not be deletable or editable so that you have reliable copies, e.g., for legal reasons
- Both you and your individual users should be able to log in to view their email, so that it is not left to just one administrator to access email for everyone during a time of crisis. If everyone can log in and view their own archived email, they are self-sufficient and work can get done.
LuxSci’s Premium Email Archival service, provided through our partner Sonian, meets all of these criteria.
In the case of disaster, Archival provides emergency access to all old email messages. Long-term retention of this kind is also commonly used to help meet HIPAA and other regulatory requirements.
2. Email Message Continuity Service
When you have an advanced inbound email filtering service (as most businesses do these days), your inbound email passes through special email filtering servers before being forwarded on to the servers where your email is actually stored.
With a Message Continuity service:
- These inbound email servers are located in a different data center from your email service provider
- They can auto-detect when your provider is offline or down (i.e., they can’t deliver new email to it)
- In a case where your email provider is down, Message Continuity is automatically enabled (you can also enable it manually on demand).
- While Message Continuity is enabled:
  - All inbound messages are queued/saved on the filtering servers instead of being delivered
  - Your users can log in to a Message Continuity web portal to read these new messages, reply to them, and send new email messages while their email provider is down.
- Once the issue is over, Message Continuity is disabled and:
  - All the queued email messages are delivered to your email provider’s servers
  - Copies of your sent email messages are also delivered there for your records.
Message Continuity services provide emergency access to new email messages and allow for sending of messages and replies in the case of a disaster.
LuxSci’s Premium Email Filtering service includes Message Continuity as an upgrade option and includes all of these features.
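The queue-and-flush behavior described above can be pictured with a toy model. This is only a sketch of the general idea, not how LuxSci or any other filtering service actually implements it. The class `ContinuityQueue` and the `deliver` callable are hypothetical names, and `deliver` is assumed to raise `ConnectionError` whenever the primary email provider is down.

```python
from collections import deque


class ContinuityQueue:
    """Toy model of a filtering server with message continuity.

    `deliver` is a hypothetical callable that hands a message to the
    primary email provider and raises ConnectionError when it is down.
    """

    def __init__(self, deliver):
        self.deliver = deliver
        self.enabled = False    # continuity mode on/off
        self.queued = deque()   # inbound mail held during the outage
        self.sent_copies = []   # copies of mail sent from the portal

    def receive(self, message):
        """Handle one inbound message."""
        if not self.enabled:
            try:
                self.deliver(message)
                return "delivered"
            except ConnectionError:
                # Delivery failed: auto-enable continuity mode.
                self.enabled = True
        self.queued.append(message)
        return "queued"

    def portal_inbox(self):
        """Messages users can read in the portal during the outage."""
        return list(self.queued)

    def portal_send(self, message):
        """Keep a copy of a message sent from the portal (sending omitted)."""
        self.sent_copies.append(message)

    def recover(self):
        """Provider is back: flush queued mail, then sent copies, in order."""
        while self.queued:
            self.deliver(self.queued[0])  # raises if the provider is still down
            self.queued.popleft()
        while self.sent_copies:
            self.deliver(self.sent_copies[0])
            self.sent_copies.pop(0)
        self.enabled = False
```

A message is removed from a queue only after `deliver` succeeds, so if the provider goes down again in the middle of `recover`, nothing that was waiting is lost.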
3. Backup Email Account
A backup email account is an account either at a different email provider or on a different server on your current email provider. Depending on the situation you could:
- Duplicate Email Account: Have copies of all inbound email messages go to both accounts, so you can switch to the backup account at any time. This works as long as the servers receiving your email and forwarding copies to the backup account are still online and as long as at least one of the two accounts is online. See Split domain routing for a detailed description. Many of our customers, including LuxSci support itself, use this as a simple, inexpensive backup mechanism that protects against single server failure.
- DNS MX Record Change: You could have a “hot standby” account with another provider. In an emergency, you can change your DNS MX records to point to that provider and start receiving your new email there (how quickly depends on the TTL of your MX records). The downsides of this are:
  - You do not have copies of old email there, and you have to make the switch manually.
  - Messages that end up at the new provider may be hard to move back to the old provider after the situation is resolved.
However, with this option available in conjunction with the “Duplicate Email Account” option, you are covered for scenarios of varying severity.
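The conditions under which the “Duplicate Email Account” setup keeps working can be written down as a small decision helper, together with a basic reachability probe. This is a minimal sketch: `smtp_port_open` and `plan_failover` are hypothetical names, the default port 25 is an assumption (many office and home networks block outbound connections on it, so you may need a different port), and an open port only shows that something is listening, not that mail is being accepted.

```python
import socket


def smtp_port_open(host, port=25, timeout=5.0, connect=socket.create_connection):
    """Return True if a TCP connection to host:port succeeds.

    `connect` defaults to socket.create_connection and can be swapped
    out for testing without touching the network.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False
    conn.close()
    return True


def plan_failover(primary_up, backup_up, forwarder_up):
    """Decide which account to read from and whether new mail still arrives."""
    if primary_up:
        account = "primary"
    elif backup_up:
        account = "backup"
    else:
        account = None
    return {
        "read_from": account,
        # New mail only arrives if the forwarding servers are online
        # and at least one account is there to receive it.
        "new_mail_arriving": forwarder_up and account is not None,
    }
```

For example, `plan_failover(False, True, True)` reports that you should read from the backup account and that new mail is still arriving, which is the single server failure case the duplicate account protects against.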
4. Reliable DNS
We see many cases every week of people’s email not working due to DNS issues (read more about DNS). These are caused by:
- Slow DNS servers
- Attacks on DNS servers taking them offline
- Shoddy DNS service
Choosing a good DNS provider helps ensure that your email can be delivered even in the face of denial of service attacks on DNS, and that the service will work as advertised. We highly recommend our DNS service, provided through our partner EasyDNS, which uses Anycast for denial of service protection. EasyDNS is about as different from “Godaddy DNS” as you can get. We also recommend setting up redundant DNS services to protect yourself against DNS outages due to denial of service attacks or DNS service provider issues. See: DNS services that shrug off denial of service attacks.
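To get a feel for why redundant DNS helps, here is a back-of-the-envelope calculation. It assumes the providers fail independently of each other, which is a strong assumption: an attack aimed at your domain, or a bad record pushed to both providers, would take them down together. The function names and the availability figures are illustrative only.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(availabilities):
    """Chance that at least one provider is up, assuming independent failures."""
    all_down = 1.0
    for a in availabilities:
        if not 0.0 <= a <= 1.0:
            raise ValueError("availability must be between 0 and 1")
        all_down *= 1.0 - a
    return 1.0 - all_down


def expected_downtime_hours(availability):
    """Expected hours per year with no working DNS."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Illustrative figures: two providers, each assumed to be up 99% of the time.
single = 0.99
redundant = combined_availability([0.99, 0.99])
print(f"one provider:  {expected_downtime_hours(single):.1f} hours/year down")
print(f"two providers: {expected_downtime_hours(redundant):.2f} hours/year down")
```

Under these assumptions, a second provider cuts expected DNS downtime by a factor of one hundred. Real-world gains will be smaller to the extent that failures are correlated.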
5. Data Backups
You must be sure that your email provider makes backups of your email data so that mail can be restored on accidental deletion and so servers can be restored completely from backup in the case of a catastrophic server failure.
Additionally, backups should exist in 2 separate locations: on site and off site. On-site backups provide fast recovery for recent data. Off-site backups provide slower recovery for older data and protect in the case of a catastrophic issue with the main live infrastructure that affects both your mail servers and their on-site backups.
LuxSci provides daily on-site and weekly off-site backups for all accounts. Reasonable on-demand restores of deleted email folders from backups are also free for all accounts. Getting a dedicated server from LuxSci? Ask about custom backup schedules, custom retention periods, and server imaging as further backup options.
6. What servers are you using?
“My email is hosted in the cloud”.
That statement doesn’t address the reliability of your email service. This is a case of “you get what you pay for.” As you may know, “cloud” just means “someone else’s computer.” Typically, inexpensive cloud servers at any provider are single (virtual) machines running on a single physical computer. If that hardware fails in any way (e.g., there is a “short,” the network cable breaks, or the power supply dies), the server and your email are down immediately until the problem can be diagnosed and repaired. This can take anywhere from 30 minutes to many hours, and most service level agreements exclude this kind of downtime, so it does not even count as a “problem.”
What can you do? You can choose an email service with servers that have resiliency against hardware failure. There are many ways to do this, but essentially, if the hardware they are running on fails (which it can), your email server is automatically restarted on another server within seconds. After that … it’s business as usual for you while the service provider fixes the problem server.
These resilient servers (LuxSci calls them “Enterprise-class” servers) can be more expensive, but they are completely worth it when your email is business critical.
There are many other considerations that should be addressed when developing a full disaster recovery plan, such as a communication plan for staff, what to do when your office is offline, as well as testing and reviewing your plan periodically. If you have not thought about this in a while, we recommend setting aside time to do so soon … and start with an analysis of what is most important for your business so it keeps humming along in the face of problems.