Documentation - Healthchecks Meteopress

How Healthchecks Meteopress Works

Each check on the My Checks page has a unique "ping" URL. Whenever you make an HTTP request to this URL, Healthchecks Meteopress records the request and updates the "Last Ping" value of the corresponding check.

When a configurable amount of time passes since the last received ping, the check is considered "late". Healthchecks Meteopress then waits for an additional period (configured with the "Grace Time" parameter) and, if it still has not received a ping, sends you an alert.

As long as the monitored service sends pings on time, you receive no alerts. As soon as it fails to check in on time, you get notified. It is a simple idea.

Signalling a Success

At the end of your batch job, add a bit of code to request your ping URL.

HTTP and HTTPS protocols both work. Prefer HTTPS, but on old systems you may need to fall back to HTTP.
Request method can be GET, POST or HEAD.
Both IPv4 and IPv6 work.
For HTTP POST requests, you can include additional diagnostic information for your own reference in the request body. If the request body looks like a UTF-8 string, Healthchecks Meteopress will log the first 10 kilobytes of the request body, so you can inspect it later.

The response has status code "200 OK", and the response body is the short string "OK".
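As a rough sketch of the POST variant described above, the snippet below prepares a ping request that carries diagnostic text in its body, using only the Python standard library. The helper names `build_ping_request` and `send_ping` and the sample text are hypothetical, and the URL is the placeholder used throughout this page. The 10,000-byte cap is an assumption about what "10 kilobytes" means; it is applied on the client only to avoid sending data that would not be logged anyway.

```python
import urllib.request

PING_URL = "https://healthchecks.meteopress.cz/ping/your-uuid-here"
# Assumption: "10 kilobytes" is read as 10,000 bytes; adjust if needed.
MAX_BODY_BYTES = 10_000


def build_ping_request(url, diagnostics=""):
    """Return a POST request whose body carries UTF-8 diagnostic text."""
    body = diagnostics.encode("utf-8")[:MAX_BODY_BYTES]
    # Drop a multi-byte character that the cut may have split in half,
    # so the body is still valid UTF-8.
    body = body.decode("utf-8", errors="ignore").encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def send_ping(request, timeout=10):
    """Send the request and return (status code, response body)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


request = build_ping_request(PING_URL, "backup finished, 1024 files copied")
# send_ping(request) would perform the actual network call.
```

Keeping request construction separate from sending makes it possible to inspect the body before anything goes over the network.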

Signalling a Failure

Append /fail to a ping URL to actively signal a failure. Requesting the /fail URL immediately marks the check as "down". Use this feature to minimize the delay between your monitored service failing and you receiving a notification.

Below is a skeleton code example in Python which signals a failure when the work function returns an unexpected value or throws an exception:

import requests
URL = "https://healthchecks.meteopress.cz/ping/your-uuid-here"

def do_work():
    # Do your number crunching, backup dumping, newsletter sending work here.
    # Return a truthy value on success.
    # Return a falsy value or throw an exception on failure.
    return True

success = False
try:
    success = do_work()
finally:
    # On success, requests https://healthchecks.meteopress.cz/ping/your-uuid-here
    # On failure, requests https://healthchecks.meteopress.cz/ping/your-uuid-here/fail
    # The timeout keeps a stalled network call from hanging the job forever.
    requests.get(URL if success else URL + "/fail", timeout=10)

Measuring Job Execution Time

Append /start to a ping URL and use it to signal when a job starts. After receiving a start signal, Healthchecks Meteopress will show the check as "Started". It will store the "start" events and display the job execution times. The job execution times are calculated as the time gaps between adjacent "start" and "complete" events.
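To illustrate how execution times can be derived from adjacent "start" and "complete" events, here is a minimal sketch. It is not the service's actual implementation: the function name `execution_times`, the tuple-based event representation, and the sample timestamps are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def execution_times(events):
    """Pair each "start" event with the next "complete" event.

    `events` is a list of (timestamp, kind) tuples sorted by time.
    Returns one timedelta per completed run.
    """
    durations = []
    started_at = None
    for timestamp, kind in events:
        if kind == "start":
            started_at = timestamp
        elif kind == "complete" and started_at is not None:
            durations.append(timestamp - started_at)
            started_at = None  # wait for the next "start"
    return durations


# Hypothetical sample data: two runs on consecutive days.
events = [
    (datetime(2024, 1, 1, 6, 8, 0), "start"),
    (datetime(2024, 1, 1, 6, 8, 50), "complete"),
    (datetime(2024, 1, 2, 6, 8, 0), "start"),
    (datetime(2024, 1, 2, 6, 9, 0), "complete"),
]
print([d.total_seconds() for d in execution_times(events)])  # prints [50.0, 60.0]
```

A "complete" event with no preceding "start" is ignored in this sketch, since no duration can be computed for it.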

Signalling a start kicks off a separate timer: the job now must signal a success within its configured "Grace Time", or it will get marked as "down".

Below is a code example in Python:

import requests
URL = "https://healthchecks.meteopress.cz/ping/your-uuid-here"


# "/start" kicks off a timer: if the job takes longer than
# the configured grace time, the check will be marked as "down"
try:
    requests.get(URL + "/start", timeout=5)
except requests.exceptions.RequestException:
    # If the network request fails for any reason, we don't want
    # it to prevent the main job from running
    pass


# TODO: run the job here
fib = lambda n: n if n < 2 else fib(n - 1) + fib(n - 2)
print("F(42) = %d" % fib(42))

# Signal success:
requests.get(URL, timeout=5)
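The start, success and failure signals can also be combined in one place. The sketch below wraps a job in a context manager that requests /start on entry, the plain ping URL on a clean exit, and /fail when the job raises. The names `monitored` and `send_ping` are hypothetical, and the standard-library `urllib` is used here instead of `requests` only to keep the sketch dependency-free.

```python
import contextlib
import urllib.request

URL = "https://healthchecks.meteopress.cz/ping/your-uuid-here"


def send_ping(url, timeout=5):
    """Request the URL, ignoring network failures so the job keeps running."""
    try:
        urllib.request.urlopen(url, timeout=timeout).close()
    except OSError:
        # URLError, HTTPError and socket timeouts are OSError subclasses.
        pass


@contextlib.contextmanager
def monitored(url, send=send_ping):
    """Signal a start on entry, then a success or a failure on exit."""
    send(url + "/start")
    try:
        yield
    except Exception:
        send(url + "/fail")
        raise  # let the original error propagate
    else:
        send(url)


# Demonstration without network access: record the URLs instead of requesting them.
calls = []
with monitored(URL, send=calls.append):
    pass  # the real job would run here
print(calls)
```

In real use, `with monitored(URL):` around the job body would send the actual requests. The injectable `send` parameter exists only so the behavior can be checked offline.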

Examples


Crontab

When using cron, the easiest approach is probably to append a curl or wget call after your command. When the scheduled time comes, your command runs. If it completes successfully (exit code 0), curl or wget sends an HTTP GET request to the ping URL.

# m h dom mon dow command
  8 6 *   *   *   /home/user/backup.sh && curl -fsS --retry 3 https://healthchecks.meteopress.cz/ping/your-uuid-here > /dev/null

With this simple modification, you monitor several failure scenarios:

The whole machine has stopped working (power outage, janitor trips over wires, VPS provider problems, etc.)
The cron daemon is not running, or has an invalid configuration
cron does start your task, but the task exits with a non-zero exit code

Either way, when your task doesn't finish successfully, you will soon know about it.
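For jobs launched from Python rather than directly from crontab, the same "run the command, ping only on success" pattern from the crontab line above can be sketched as follows. The function names `ping_with_retries` and `run_then_ping` are hypothetical. The default of four attempts mirrors `--retry 3` (one initial try plus three retries), but unlike curl this sketch retries on any network or HTTP error, not only on transient ones.

```python
import subprocess
import time
import urllib.request

URL = "https://healthchecks.meteopress.cz/ping/your-uuid-here"


def ping_with_retries(url, attempts=4, delay=1.0, opener=urllib.request.urlopen):
    """Try the ping up to `attempts` times; return True once it succeeds."""
    for attempt in range(attempts):
        try:
            opener(url, timeout=10).close()
            return True
        except OSError:
            # Pause before the next attempt, but not after the last one.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False


def run_then_ping(command, url, ping=ping_with_retries):
    """Run `command` and ping only if it exits with code 0 (like `&&`)."""
    exit_code = subprocess.run(command).returncode
    if exit_code == 0:
        ping(url)
    return exit_code


# Usage (runs the job and performs real network requests):
# run_then_ping(["/home/user/backup.sh"], URL)
```

Returning the exit code lets a calling script pass it on, so cron still sees the job's real result.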

The extra options to curl are meant to suppress any output, unless it hits an error. This is to prevent cron from sending an email every time the task runs. Feel free to adjust the curl options to your liking.

&&	Run curl only if `/home/user/backup.sh` succeeds
-f, --fail	Makes curl treat HTTP error responses (status code 400 and above) as errors
-s, --silent	Silent or quiet mode. Don't show progress meter or error messages.
-S, --show-error	When used with -s it makes curl show error message if it fails.
--retry <num>	If a transient error is returned when curl tries to perform a transfer, it will retry this number of times before giving up. Setting the number to 0 makes curl do no retries (which is the default). Transient error means either: a timeout, an FTP 4xx response code or an HTTP 5xx response code.
> /dev/null	Redirect curl's stdout to /dev/null (error messages go to stderr)

Bash or a shell script

Both the curl and wget examples accomplish the same thing: they send an HTTP GET request.

If using curl, make sure it is installed on your target system. Ubuntu, for example, does not have curl installed out of the box.

# using curl:
# (make sure it is installed on your system!)
curl --retry 3 https://healthchecks.meteopress.cz/ping/your-uuid-here

# using wget:
wget https://healthchecks.meteopress.cz/ping/your-uuid-here -O /dev/null

Python

If you are already using the requests library, it's convenient to also use it here:

# using requests:
import requests
requests.get("https://healthchecks.meteopress.cz/ping/your-uuid-here")

Otherwise, you can use the urllib standard module.

# urllib with python 3.x:
import urllib.request
urllib.request.urlopen("https://healthchecks.meteopress.cz/ping/your-uuid-here")

# urllib with python 2.x:
import urllib
urllib.urlopen("https://healthchecks.meteopress.cz/ping/your-uuid-here")

You can include additional diagnostic information in the request body (for POST requests), or in the "User-Agent" request header:

# Passing diagnostic information in the POST body:
import requests
requests.post("https://healthchecks.meteopress.cz/ping/your-uuid-here", data="temperature=-7")

# Passing diagnostic information in the User-Agent header:
import requests
requests.get("https://healthchecks.meteopress.cz/ping/your-uuid-here", headers={"User-Agent": "temperature=-7"})

Ruby

require 'net/http'
require 'uri'

Net::HTTP.get(URI.parse('https://healthchecks.meteopress.cz/ping/your-uuid-here'))

Node

var https = require('https');
https.get("https://healthchecks.meteopress.cz/ping/your-uuid-here");

PHP

file_get_contents('https://healthchecks.meteopress.cz/ping/your-uuid-here');

C#

using (var client = new System.Net.WebClient())
{
    client.DownloadString("https://healthchecks.meteopress.cz/ping/your-uuid-here");
}

Browser

Healthchecks Meteopress includes the Access-Control-Allow-Origin: * CORS header in its ping responses, so cross-domain AJAX requests should work.

// the server returns appropriate CORS headers so cross-domain AJAX requests should work:
var xhr = new XMLHttpRequest();
xhr.open('GET', 'https://healthchecks.meteopress.cz/ping/your-uuid-here', true);
xhr.send(null);

PowerShell

You can use PowerShell and Windows Task Scheduler to automate various tasks on a Windows system. From within a PowerShell script it is also easy to ping Healthchecks Meteopress.

Here is a simple PowerShell script that pings Healthchecks Meteopress. When scheduled to run with Task Scheduler, it will essentially just send regular "I'm alive" messages. You can of course extend it to do more things.

# inside a PowerShell script:
Invoke-RestMethod https://healthchecks.meteopress.cz/ping/your-uuid-here

Save the above to e.g. C:\Scripts\healthchecks.ps1. Then use the following command in a Scheduled Task to run the script:

powershell.exe -ExecutionPolicy bypass -File C:\Scripts\healthchecks.ps1

In simple cases, you can also pass the script to PowerShell directly, using the "-command" argument:

# Without an underlying script, passing the command to PowerShell directly:
powershell.exe -command &{Invoke-RestMethod https://healthchecks.meteopress.cz/ping/your-uuid-here}

Email

As an alternative to HTTP/HTTPS requests, you can "ping" this check by sending an email message to your-uuid-here@localhost.

This is useful for end-to-end testing of weekly email delivery.

An example scenario: you have a cron job which runs weekly and sends weekly email reports to a list of email addresses. You have already set up a check to get alerted when your cron job fails to run. But what you ultimately want to check is that your emails get sent and delivered.

The solution: set up another check, and add its @localhost address to your list of recipient email addresses. Set its Period to 1 week. As long as your weekly email script runs correctly, the check will be regularly pinged and will stay up.
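A minimal sketch of that setup, assuming the weekly report is sent from Python: the check's address is simply appended to the recipient list. The helper names, the sender and recipient addresses, and the SMTP host and port are all hypothetical; only the your-uuid-here@localhost address comes from this page.

```python
from email.message import EmailMessage
import smtplib

CHECK_ADDRESS = "your-uuid-here@localhost"


def build_report(sender, recipients, body):
    """Build the weekly report, adding the check's address as a recipient."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = ", ".join(list(recipients) + [CHECK_ADDRESS])
    message["Subject"] = "Weekly report"
    message.set_content(body)
    return message


def send_report(message, host="localhost", port=25):
    """Hand the message to an SMTP server (hypothetical host and port)."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(message)


report = build_report("reports@example.org", ["alice@example.org"], "All good.")
# send_report(report) would deliver it through the SMTP server.
```

Because the check is pinged only when the message actually arrives, a broken mail relay shows up as a missed ping even though the cron job itself ran.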

When Alerts Are Sent

Each check has a configurable Period parameter, with the default value of one day. For periodic tasks, this is the expected time gap between two runs.

Additionally, each check has a Grace parameter, with a default value of one hour. You can use this parameter to account for run time variance of tasks. For example, if a backup task completes in 50 seconds one day, and completes in 60 seconds the following day, you might not want to get alerted because the backups are 10 seconds late.

Each check can be in one of the following states:

	New. A check that has been created, but has not received any pings yet.
	Monitoring Paused. You can resume monitoring of a paused check by pinging it.
	Started. The check has received a "start" signal, and is currently running.
	Up. Time since last ping has not exceeded Period.
	Late. Time since last ping has exceeded Period, but has not yet exceeded Period + Grace.
	Down. The check has not received a "success" ping in time, or it has received an explicit "fail" signal. When a check goes into the "Down" state, Healthchecks Meteopress sends you an alert.
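The time-based part of these states can be sketched as a small function. This is an illustration of the rules listed above, not the service's actual code: the name `check_status` is hypothetical, and the sketch deliberately ignores the "Monitoring Paused" and "Started" states as well as explicit "fail" signals.

```python
from datetime import datetime, timedelta


def check_status(last_ping, period, grace, now):
    """Classify a check as "new", "up", "late" or "down" from timing alone."""
    if last_ping is None:
        return "new"  # no pings received yet
    elapsed = now - last_ping
    if elapsed <= period:
        return "up"
    if elapsed <= period + grace:
        return "late"
    return "down"


# Default settings from this page: Period of one day, Grace of one hour.
period = timedelta(days=1)
grace = timedelta(hours=1)
last_ping = datetime(2024, 1, 1, 6, 8)  # hypothetical timestamp

for hours in (12, 24.5, 26):
    now = last_ping + timedelta(hours=hours)
    print(hours, check_status(last_ping, period, grace, now))
```

With the default settings, a check last pinged 12 hours ago is "up", one pinged 24.5 hours ago is "late", and one pinged 26 hours ago is "down".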