How To Reduce Website Downtime

19 Dec 2017

In an ideal world every site would have 100% uptime, 24/7, 365. However, the reality is not so perfect – hardware failures, DNS issues, DDoS attacks, server maintenance, software problems and poor hosting are among the many causes of downtime.

It’s not all doom and gloom though – by following a few practical steps you can really cut down on your downtime:


Avoid poor hosting

Poor hosting is the most common cause of downtime. It is simple to fix by moving to a better-quality host, and although migrating a website can be a hassle, it is worth the effort if your site suffers from regular downtime.

Traditional shared servers, although cheap, are usually quite susceptible to downtime, while private servers and cloud servers with self-healing technology can maintain uptime around the clock if managed properly.

A good web host should guarantee uptime through an SLA (service level agreement). The percentages can be misleading though – a guarantee of 99% uptime may sound good at first but actually allows over 7 hours of downtime each month! Aim for at least a 99.99% uptime guarantee and find out what compensation is provided if the guarantee is broken.
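To see what an uptime percentage really permits, the arithmetic can be sketched in a few lines of Python. The function name and the 30-day month are assumptions made for illustration, and a real SLA may measure the period differently.

```python
# Downtime allowed by an uptime guarantee over a given period.
HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day month


def allowed_downtime_minutes(uptime_percent: float,
                             period_hours: float = HOURS_PER_MONTH) -> float:
    """Return the minutes of downtime an uptime guarantee permits in the period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_hours * 60 * (100 - uptime_percent) / 100


if __name__ == "__main__":
    for sla in (99.0, 99.9, 99.99):
        minutes = allowed_downtime_minutes(sla)
        print(f"{sla}% uptime allows about {minutes:.1f} minutes of downtime per month")
```

Under these assumptions a 99% guarantee allows 432 minutes (7.2 hours) a month, while 99.99% allows a little over 4 minutes.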

If you manage your own servers then the quality of both the hardware and the team that manages it is paramount.

Monitor your website

Monitoring your website is an important step towards reducing downtime: you can only fix an outage quickly if you find out about it as soon as it happens.

A good service will send an alert if your website goes down. Ideally, alerts will be customizable – if your business has support staff in place day and night then email alerts are a good solution. SMS alerts may be more useful for small businesses where text messages can be sent to an emergency phone number outside office hours.
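As a rough illustration of what a monitoring service does, here is a minimal Python sketch of a single check and of choosing an alert channel. The function names, the office hours and the rule of using email while staff are on duty are all assumptions for the example; a real service will check from several locations and retry before alerting.

```python
from datetime import time
from urllib.error import URLError
from urllib.request import urlopen


def site_is_up(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (URLError, OSError):
        # Covers HTTP errors such as 503, DNS failures, refusals and timeouts.
        return False


def alert_channel(now: time, staffed_24_7: bool,
                  office_open: time = time(9, 0),
                  office_close: time = time(17, 30)) -> str:
    """Pick 'email' when someone is watching the inbox, otherwise 'sms'."""
    # Assumption: outside office hours a small team is reached by text message.
    if staffed_24_7 or office_open <= now < office_close:
        return "email"
    return "sms"
```

A scheduler would call site_is_up every minute or so and, on failure, send the alert through whichever channel alert_channel returns.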

It’s useful to be able to view the details of each outage as well as uptime statistics so that the performance of your site can be reviewed. This makes it possible to identify the cause of an outage and fix it.
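The kind of statistics involved can be sketched as follows, assuming the monitor keeps a simple log of timestamped check results. The Check record and both function names are hypothetical and not taken from any particular monitoring product.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Check:
    at: datetime  # when the check ran
    up: bool      # whether the site responded


def uptime_percent(checks: list[Check]) -> float:
    """Share of checks that found the site up, as a percentage."""
    if not checks:
        raise ValueError("no checks recorded")
    return 100 * sum(c.up for c in checks) / len(checks)


def outages(checks: list[Check]) -> list[tuple[datetime, Optional[datetime]]]:
    """Group consecutive failed checks into (first failure, first success after) pairs."""
    result = []
    started = None
    for c in sorted(checks, key=lambda c: c.at):
        if not c.up and started is None:
            started = c.at
        elif c.up and started is not None:
            result.append((started, c.at))
            started = None
    if started is not None:
        result.append((started, None))  # still down at the last check
    return result
```

The start and end times of each outage can then be compared with server logs, deployments or traffic graphs to find the cause.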

Website monitoring needn't cost the earth – you can get a free account and start monitoring your websites right away.

Take backups and test restores

Taking regular backups is something we all know is important – these should be automated so that they aren’t missed. It’s also good to test your backups and be familiar with the restore procedure so that if the worst happens you can restore your site quickly and with confidence.
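A restore test can itself be automated. The sketch below assumes the site is a directory of files backed up as a gzip-compressed tar archive; the function names are made up for the example, and a database-backed site would also need its database dump restored and checked.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def file_hashes(root: Path) -> dict[str, str]:
    """Map each file's path relative to root to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*")) if p.is_file()
    }


def backup(site_dir: Path, archive: Path) -> None:
    """Write site_dir into a gzip-compressed tar archive (kept outside site_dir)."""
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(site_dir, arcname=".")


def restore_matches(site_dir: Path, archive: Path) -> bool:
    """Restore the archive into a scratch directory and compare it with the live files."""
    with tempfile.TemporaryDirectory() as scratch:
        # Assumption: the archive is our own trusted backup.
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(scratch)
        return file_hashes(Path(scratch)) == file_hashes(site_dir)
```

Running restore_matches straight after each backup confirms that the archive can actually be unpacked and that nothing is missing from it.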

Update CMSs with care

If your website uses a content management system, keeping it up to date with the latest version is one of the most important steps you can take to avoid leaving your site vulnerable to exploits.

However, automated CMS updates are themselves a common cause of website downtime: new versions can be incompatible with plugins and themes, and this can bring a website down. It’s best to schedule updates for off-peak times, to be on hand when they take place, and always to be ready to roll back to the last working version if there are problems.
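The update-then-verify-then-roll-back routine can be sketched as a small Python function. The three callables are placeholders that you would supply for your own CMS (for example a script that runs the update, a request to the home page, and a restore of the previous version); none of them refer to a real CMS API.

```python
from typing import Callable


def update_with_rollback(apply_update: Callable[[], None],
                         health_check: Callable[[], bool],
                         roll_back: Callable[[], None]) -> bool:
    """Apply an update, then roll back if it fails or the site is unhealthy.

    Returns True if the update was kept, False if it was rolled back.
    """
    try:
        apply_update()
        healthy = health_check()
    except Exception:
        # A crash during the update or the check counts as a failed update.
        healthy = False
    if not healthy:
        roll_back()
    return healthy
```

Keeping the health check separate from the update means the same check can be reused by your monitoring.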

Keep an eye on bandwidth

It’s a good idea to monitor the bandwidth that is used by visitors to your site. If an unusually large amount of bandwidth is being used it may be a spike in legitimate traffic but it may also be traffic from bad bots.
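One simple way to decide what counts as "unusually large" is to compare the latest reading with recent history. The sketch below flags a reading more than three standard deviations above the recent average; the function name, the unit (megabytes per hour) and the threshold are assumptions, and sites with strong daily or weekly patterns will need something smarter.

```python
from statistics import mean, stdev


def is_bandwidth_spike(history_mb: list[float], latest_mb: float,
                       sigmas: float = 3.0) -> bool:
    """Flag latest_mb if it is more than `sigmas` standard deviations above the recent mean."""
    if len(history_mb) < 2:
        raise ValueError("need at least two past readings to estimate variation")
    threshold = mean(history_mb) + sigmas * stdev(history_mb)
    return latest_mb > threshold
```

A flagged reading only tells you that something has changed; the access logs are still needed to tell a busy day from a bot.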

Comment spam bots are a common problem. They are easily avoided by disabling comments or requiring moderation, but if they manage to post successfully they can inundate a site and slow it to a halt. They often continue to bombard a site even after comments have been disabled.

DDoS attacks have become more prevalent and sophisticated in the last few years – large botnets have been used to bombard sites with traffic and have brought some high profile sites down.

Prevention is better than cure when dealing with both DDoS and spam bots. If a content delivery network (CDN) is a good fit for your site and is affordable, it is a tried and tested solution to both problems.

Have a plan

If your website goes down, having a plan of action in place will reduce the time it takes to get the site live again and is also likely to make life less stressful for those involved. Having clearly defined roles is important – know who will be alerted if the site goes down and have a checklist of actions that they need to take.