What is an Acceptable Amount of Downtime?
Last updated on Sep 11, 2015
Over the long term, zero downtime is a myth. Servers need regular maintenance to stay healthy, and that maintenance also helps prevent more drastic downtime later. Businesses and web hosting companies need to be able to schedule maintenance in order to minimize unplanned downtime and outages. Even the largest organizations in the world are sometimes hit with outages, server issues, and maintenance that keep them from achieving zero downtime. Google and Bing/Yahoo, the largest search engines, have experienced downtime in the past, equating to several hundred thousand dollars in lost potential revenue.
Common causes of downtime include component failures, network issues, errors made by server administrators, and attacks by malicious users. With options such as high-availability server clustering, security management, DDoS protection and mitigation, and controlled backups, companies can decrease the likelihood of downtime and improve their chances of achieving 99.999% uptime.
Scheduled Maintenance: Most scheduled maintenance shouldn’t result in long downtimes. “Scheduled” means you should be notified in advance by email of anything that may cause downtime or major changes to your account.
Unexpected Server Failures: These happen to even the best web hosting providers out there. What matters is how the provider responds to the crisis: how fast they get things fixed, and how they plan to prevent such failures from happening again.
How much is downtime costing you?
For multi-billion-dollar organizations such as Yahoo or Google, even ten minutes of downtime can mean significant losses in revenue. For Google, the cost of mitigating downtime is outweighed by the revenue they keep by staying online. For a small to medium-sized business, though, is ten minutes going to equate to large losses in potential revenue? Will the cost of trying to avoid downtime be outweighed by the revenue you are not losing?
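One rough way to answer that question is to put numbers on it. The sketch below is only a back-of-the-envelope estimate: it assumes revenue arrives evenly through a 30-day month (real traffic rarely does), and the function names and the $30,000 figure are made up for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def estimated_loss(monthly_revenue: float, downtime_minutes: float,
                   days_in_month: int = 30) -> float:
    """Revenue lost to downtime, assuming revenue is spread evenly over the month."""
    minutes_in_month = days_in_month * MINUTES_PER_DAY
    return monthly_revenue * downtime_minutes / minutes_in_month


def mitigation_is_worth_it(monthly_revenue: float, minutes_avoided: float,
                           mitigation_cost: float) -> bool:
    """True if the revenue saved by avoiding downtime exceeds what avoiding it costs."""
    return estimated_loss(monthly_revenue, minutes_avoided) > mitigation_cost


# Hypothetical small business with $30,000 a month in online revenue.
loss = estimated_loss(30_000, downtime_minutes=10)
print(f"Ten minutes of downtime costs roughly ${loss:.2f}")
print("Worth paying $50/month to avoid it?", mitigation_is_worth_it(30_000, 10, 50))
```

Under these assumptions ten minutes costs this business about $7, so an expensive high-availability setup would be hard to justify on lost revenue alone. Downtime during a sale or a traffic peak would cost far more than the even-spread estimate suggests.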
Uptime Monitoring
Whether you have shared hosting, reseller hosting, VPS hosting or dedicated server hosting, downtime is something that needs to be considered. Without an uptime monitor you won’t know how much downtime your website is really experiencing. There are plenty of uptime monitoring services out there, and they are really easy to set up. Here are a couple that I use and recommend trying, both with free and paid plans: Uptime Doctor and Uptime Robot.
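If you are curious what a monitor does under the hood, here is a minimal sketch using only Python’s standard library. It is not how Uptime Doctor or Uptime Robot work internally; the function names and the `opener` parameter are my own, and a real service also checks from several locations and sends alerts.

```python
import urllib.request


def is_up(url: str, timeout: float = 10.0, opener=urllib.request.urlopen) -> bool:
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # URLError, HTTPError (4xx/5xx responses), timeouts and refused
        # connections all derive from OSError, so each counts as "down".
        return False


def uptime_percent(results) -> float:
    """Percentage of successful checks in a series of True/False results."""
    results = list(results)
    if not results:
        raise ValueError("no checks recorded")
    return 100.0 * sum(results) / len(results)
```

Calling `is_up("https://example.com")` from a scheduled job once a minute and feeding the stored results to `uptime_percent` gives a rough monthly figure. The check should run from a machine outside your hosting account, otherwise it goes down together with the site it is watching.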
So, what is an appropriate amount of downtime?
There is no clear-cut answer to this question. Most web hosting companies offer a 99.9% uptime guarantee, and it is a standard that web hosting customers are used to seeing. Other companies go above this with 99.99% or below it with a 99% uptime guarantee, so you may be wondering how much downtime each of these equates to:
- 99% ≈ 7.3 hours of downtime per month
- 99.9% ≈ 43 minutes of downtime per month
- 99.95% ≈ 22 minutes of downtime per month
- 99.99% ≈ 4.5 minutes of downtime per month
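To check figures like these for any guarantee, the arithmetic is simply the minutes in a month multiplied by the fraction of time the host is allowed to be down. This small sketch assumes a 30-day month by default (my choice, the function name is made up too); a 31-day or average-length month gives slightly higher numbers, which is why rounded figures differ a little between sources.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(uptime_percent: float, days_in_month: float = 30) -> float:
    """Minutes of downtime per month that still satisfy the given uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = days_in_month * MINUTES_PER_DAY
    return total_minutes * (100 - uptime_percent) / 100


for pct in (99, 99.9, 99.95, 99.99, 99.999):
    print(f"{pct}% uptime allows {allowed_downtime_minutes(pct):.1f} minutes down per month")
```

For a 30-day month this works out to 432 minutes (7.2 hours) at 99% and about 43 minutes at 99.9%, while 99.999% leaves less than half a minute.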
Keep in mind that an uptime guarantee doesn’t really tell you how much uptime you can expect from a web hosting provider. An “uptime guarantee” does not guarantee that your website will be up for X percent of the month. All it guarantees is that if your uptime falls below X percent, you will be compensated in some way. How you are compensated and what qualifies for the uptime guarantee vary from host to host, so check their service level agreement and/or terms of service.
For example, MonsterMegs has a 99.9% uptime guarantee under which they provide a pro-rated credit for uptime below 99.9% in a month. Downtime resulting from scheduled maintenance doesn’t count (the guarantee only applies to unscheduled outages). Here’s a screenshot of how they calculate the credits.
Pop Quiz, Hotshot…
What is an acceptable amount of downtime?
A.) On average 5 minutes or less a month?
B.) On average 25 minutes or less a month?
C.) On average 45 minutes or less a month?
D.) On average 90 minutes or less a month?
Of course A is ideal, but as long as everything else about my hosting account is golden, I’m fine with B for most of my websites. If there is a rare month with a larger amount of downtime, honest communication with customers about the cause is key.
What do you consider to be an acceptable amount of downtime?
Beware of the uptime guarantee that’s actually meaningless. I used to be on Bluehost, and if they didn’t meet their uptime guarantee, you were entitled to close your account and get a prorated refund for the unused portion. Of course, you had that ability anyway, even if they DID meet their uptime guarantee, so the guarantee meant absolutely NOTHING.
I believe they finally dropped the charade, and no longer provide any uptime guarantee at all.
Bluehost is owned by EIG, and they don’t care if you go down or what guarantee they promised you. So this does not surprise me.
The “useless guarantee” was there long before EIG acquired Bluehost. I was actually kind of surprised to see it go away under EIG – making empty promises is typical of the EIG business model!
Arvixe provides an “SLA Credit” if you ask for it during the following month. So far I have had credits for every month since September 2015, using data from my free Pingdom account. But I must be in the minority, because they don’t provide any monitoring of their own.
They “guarantee” 99.99%, but since they were taken over by EIG they haven’t been able to provide even 99% in some months.
99.99% is a reasonable expectation and a target that we have hit 17 out of 18 months (1 month at 99.8%) over the past year and a half across all of our shared servers. It is not all that easy, though. There are a few things that allow us to do this.
1. Monitoring and automation: All servers and all critical services are monitored. Should a service meet what we have set as degraded or down, it is automatically restarted by our monitoring software. This is extremely effective and lowers our man-hours significantly, along with downtime.
2. Rebootless kernel updates via KernelCare. The only time we require a reboot of the actual server is on a full kernel upgrade.
3. Bare Metal Backups: All of our shared servers have Bare Metal Backups taken daily and stored in the cloud. This allows us to recover from a catastrophic server failure by restoring the Bare Metal Backup onto a “Hot Spare” server. The fact that we keep our overall storage space pretty small, and that it is SSD only, makes the recovery quicker.
If you are concerned about uptime, then make sure you talk to your provider. Ask lots of questions and get the answers you require.