Website Content (Keyword) Monitoring

15 Mar 2019

...new feature!

We've just rolled out another major new feature - website content monitoring. If the content of your website changes unexpectedly, Downtime Monkey will alert you.

website content monitoring - hacked websites


How Content Monitoring Works

A request is sent to the webpage and the source code of the webpage is checked to make sure that it contains a keyword phrase that you have supplied. If the phrase isn't found then the site will be recorded as "down: keywords not found" and alerts will be sent.
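As a rough illustration of the check described above, here is a minimal Python sketch. The function name and the example page are hypothetical, and the plain case-sensitive substring match is an assumption rather than a description of Downtime Monkey's actual code:

```python
def keywords_found(page_source: str, phrase: str) -> bool:
    """Return True if the exact phrase appears in the raw page source.

    Assumption: a plain, case-sensitive substring match against the
    source code of the page, with no HTML entity decoding.
    """
    return phrase in page_source


# Hypothetical page source used only for illustration.
source = '<h1>Acme Widgets</h1><a href="/terms">Terms &amp; Conditions</a>'

print(keywords_found(source, "Acme Widgets"))        # True
print(keywords_found(source, "Terms & Conditions"))  # False: the source contains "&amp;"
```

Because the source code is checked rather than the rendered page, a phrase copied from the page source is a safer choice than one copied from the browser window.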

This works well as a complement to standard downtime monitoring, catching some website problems that would otherwise be missed...



Catch On-Page Errors

Most of the time when a website goes down the result is an HTTP error. For example, "404 Not Found" or "503 Service Unavailable". These errors are caught by Downtime Monkey's downtime monitoring - no problem!

However, sometimes when a site goes down an HTTP error is not produced. Instead the webpage is served with a normal response of "200 OK" and the error is printed on the otherwise blank page.

Database Problems

When a database problem occurs this usually produces an on-page error - an example is when a database is overloaded and the error is shown: "SQL Error: Too many connections[1040]". This is common on busy forums:

website content monitoring - database errors


Memory Limits Exceeded

When a script's memory limits are exceeded this also produces an error on-page - this is common in heavy content management systems such as WordPress: "Fatal error: Allowed memory size of xxxxx bytes exhausted":

website content monitoring - memory limits


These are just a couple of examples from a multitude of possible on-page errors - content monitoring will catch any of them that stop your keyword phrase from appearing on the page!
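To show why a "200 OK" response alone isn't enough, here is a minimal Python sketch that combines a status code check with a keyword check. The function name, the status strings other than "down: keywords not found", and the rule that any code of 400 or above counts as an HTTP error are assumptions made for illustration:

```python
def classify_check(status_code: int, body: str, phrase: str) -> str:
    """Classify one check result from the status code and page source."""
    if status_code >= 400:
        # Assumption: 4xx/5xx responses are handled by downtime monitoring.
        return f"down: http error {status_code}"
    if phrase not in body:
        # The page was served normally but the expected phrase is missing.
        return "down: keywords not found"
    return "up"


phrase = "Welcome to our forum"  # hypothetical keyword phrase

print(classify_check(503, "", phrase))
# down: http error 503
print(classify_check(200, "SQL Error: Too many connections[1040]", phrase))
# down: keywords not found
print(classify_check(200, "<h1>Welcome to our forum</h1>", phrase))
# up
```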

Catch Hacked Websites

It's a sad state of affairs but attacks on websites are now an everyday occurrence. Attackers regularly take over sites and replace the page content with their own, often malicious, content.

Content monitoring can catch this and send you an alert, so you'll get notified and can restore your website right away.

website content monitoring - hacked


Getting Started With Content Monitoring

It's really easy to set up content monitoring for your site - get started in less than a minute:

1) Log in to your Downtime Monkey account (content monitoring is a Pro feature so you'll need to be on a Pro Plan).

2) Add a new monitor (skip this step if you want to add content monitoring to an existing monitor).

3) Go to your monitors, scroll to the monitor of your choice and click the monitor settings icon.

4) Select "On" from the "Keyword monitoring on/off" dropdown menu.

5) In the "Keywords" field, input an exact match phrase from your webpage.

6) Click update monitor - that's all!

Minimising Load To Your Server

One of the aspects that we spent a lot of time on when developing the feature was minimising the load to your server.

Unlike downtime monitoring, content monitoring requires that the webpage content is downloaded each time a check is made. This uses bandwidth and when considering the frequency of the checks it was important to take some steps to minimise this load:

1) Websites are monitored every 3 minutes (as opposed to every minute for downtime monitoring). This reduction in monitoring frequency cuts bandwidth by two thirds, since only one check is made for every three that would be made at one-minute intervals.

2) Only the source code (i.e. the text) of the webpage is downloaded. Images, video and other heavy content are ignored.

3) Where the website server permits, only the first 5KB of the page is accessed. Not all servers are configured to allow partial page loads, but where they are we take advantage of the savings in bandwidth.

4) When the website server doesn't permit partial page loads the page size is limited to a maximum of 50KB. This was a big decision for us as it means that some very heavy webpages won't be able to use this feature. However, we were also aware that we needed to place a limit as some sites have huge pages (over a MB of code) and owners of these sites probably won't want the load on their server to go through the roof. The limit of 50KB corresponds to roughly 50,000 characters of source code and when we surveyed a bunch of websites we found that 96% of sites either used less than this or allowed partial page loads.
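The partial-load and size-limit steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not Downtime Monkey's implementation: it assumes partial page loads are requested with a standard HTTP Range header, that 5KB and 50KB mean 5 * 1024 and 50 * 1024 bytes, and that an oversize page is reported by raising an error. The function names are hypothetical.

```python
import io
import urllib.request

PARTIAL_BYTES = 5 * 1024  # "first 5KB" (exact byte count is an assumption)
MAX_BYTES = 50 * 1024     # "maximum of 50KB" (exact byte count is an assumption)


def build_request(url: str) -> urllib.request.Request:
    """Ask the server for only the first PARTIAL_BYTES bytes of the page."""
    return urllib.request.Request(
        url, headers={"Range": f"bytes=0-{PARTIAL_BYTES - 1}"}
    )


def read_page(status_code: int, stream) -> bytes:
    """Read the response body while respecting both limits.

    206 Partial Content: the server honoured the Range header.
    Any other status (e.g. 200 OK): the server is sending the whole page.
    """
    if status_code == 206:
        return stream.read(PARTIAL_BYTES)
    data = stream.read(MAX_BYTES + 1)  # one extra byte reveals an oversize page
    if len(data) > MAX_BYTES:
        raise ValueError("page is larger than the 50KB limit")
    return data


# Simulated responses, so the sketch runs without network access.
print(len(read_page(206, io.BytesIO(b"x" * 5120))))    # 5120
print(len(read_page(200, io.BytesIO(b"x" * 20000))))   # 20000
```

With a real site, the same functions could be used as `with urllib.request.urlopen(build_request(url)) as resp: body = read_page(resp.status, resp)`.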

Alerts

Alert settings for content monitoring are the same as the settings that are in place for downtime monitoring. If you already have alerts set to email, SMS or Slack then there's no need to change anything - you'll receive these alerts for both downtime and content monitoring.

Alerts are of the form: "URL is down: keywords don't match" or "URL is up: keywords match" so that you can tell right away that the alert is for content monitoring.

Rate limits that you have set for SMS and Slack alerts also apply to content monitoring - if you have a rate limit set, the total number of alerts per hour won't exceed this.
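One way to picture a shared hourly limit is the sketch below. The class name and the sliding one-hour window are assumptions for illustration; the post doesn't say how the limit is actually enforced.

```python
from collections import deque


class HourlyRateLimiter:
    """Allow at most max_per_hour alerts in any rolling 60-minute window."""

    def __init__(self, max_per_hour: int):
        self.max_per_hour = max_per_hour
        self.sent = deque()  # timestamps (in seconds) of alerts already sent

    def allow(self, now: float) -> bool:
        # Forget alerts that were sent an hour or more ago.
        while self.sent and now - self.sent[0] >= 3600:
            self.sent.popleft()
        if len(self.sent) >= self.max_per_hour:
            return False
        self.sent.append(now)
        return True


# One limiter is shared by downtime alerts and content alerts.
limiter = HourlyRateLimiter(max_per_hour=2)
print(limiter.allow(0))     # True  (downtime alert)
print(limiter.allow(60))    # True  (content alert)
print(limiter.allow(120))   # False (limit reached, whichever type it is)
print(limiter.allow(3600))  # True  (the first alert is now an hour old)
```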

Custom alert delays (e.g. only send alerts if a site is down for 2 minutes) don't apply to content monitoring. Alerts are sent instantly when a content mismatch is found - this is because content mismatches are an indication of a serious problem that usually doesn't self-heal, so you'll get notified ASAP.

Content Monitoring Logs

All content monitoring records are logged and can be viewed on a monitor's stats page, along with the start time, end time, duration and explanation for the event.

Content monitoring events aren't included in the overall uptime stats for the site as content monitoring is treated separately from downtime monitoring.
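As an illustration of how logged events and uptime stats might relate, here is a small Python sketch. The record fields mirror the ones listed above (start time, end time, duration, explanation); the class, the `kind` field and the uptime formula are assumptions, not Downtime Monkey's data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    kind: str          # "downtime" or "content" (hypothetical labels)
    start: datetime
    end: datetime
    explanation: str

    @property
    def duration(self) -> timedelta:
        return self.end - self.start


def uptime_percent(events, period_start: datetime, period_end: datetime) -> float:
    """Uptime over the period, counting only downtime events as down."""
    total = (period_end - period_start).total_seconds()
    down = sum(e.duration.total_seconds() for e in events if e.kind == "downtime")
    return 100.0 * (total - down) / total


t0 = datetime(2019, 3, 15, 12, 0)
events = [
    Event("downtime", t0, t0 + timedelta(minutes=10), "503 Service Unavailable"),
    Event("content", t0 + timedelta(minutes=30), t0 + timedelta(minutes=50),
          "keywords not found"),
]
# Over a 100-minute period only the 10-minute downtime event counts.
print(uptime_percent(events, t0, t0 + timedelta(minutes=100)))  # 90.0
```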

A Short Period Of Beta

This feature is now live and will undergo a short period of Beta testing over the next few weeks.

If you have any questions check out the Content Monitoring FAQ.

Finally, a big thank you to all the people who submitted feature requests for this - you have really helped us to improve Downtime Monkey!