How to Automate Uptime Display in StatusPage.io via Synthetic Monitors in Datadog in 4 steps

I was given the task of converting a 1/0 metric on our Statuspage.io page from Datadog Metrics using the Route53 healthcheck to an actual percentage uptime display in StatusPage.io, or at least something similarly meaningful to the end user.

First stop: Service Level Objectives

When browsing around our current monitors and dashboards, one thing that stood out was “service level objectives.” In combination with synthetics, they provide an uptime percentage over a period of time that can be embedded on the dashboard. [We’ll come back to synthetics on a different approach]

SLO Synthetics Uptime Display in Datadog

Next stop: Trying to embed those SLOs

The System Metrics integration on the statuspage.io side seems to really only be built for flat queries for a point-in-time, and not aggregated over a period of time of days or weeks. A aws.route53.health_check_status query that produced either a 1 or a 0 at any given point in time was fine, but coming up with a way to “query” for a 24 hour or 90 day up time was a different story (impossible to do via direct integration between the two apps?)

Third stop: UptimeRobot and Similar

Jyll over @ Veracity.net suggested some experimentation with Uptime Robot and similar services with my own free instance of StatusPage, and it was in stripping away the extra configuration and being able to feed a simple up/down email or webhook to statuspage.io that I came back to the idea of looking to see if I could email or webhook synthetic alerts from Datadog to Statuspage. (Spoiler: You can!)

Final stop (and the actual steps needed!) Automating Datadog to Send Status to get Uptime Display in StatusPage.io

  1. Add a component in your statuspage.io account
  2. Click on the “Automation” button to get the automation email. Copy that email:
uptime display in statuspage.io
Click the Automation button to reveal your automation email

3. (Create a synthetic monitor that checks a heartbeat route if you don’t already have one)

4. Go to your synthetic monitor in Datadog… under Step 6 is “Notify your team”. Your monitor name needs to use the template variables {{#is_alert}}DOWN{{/is_alert}}{{#is_recovery}}UP{{/is_recovery}} for statuspage automation to understand the message. The rest of the monitor name is irrelevant (as long as DOWN or UP isn’t a fixed part of that name!)

The automation email needs to be mentioned in the message body with an @ in front of it.

Monitor alert settings
No, that’s not a valid automation email.

Monitoring your S3 buckets for #omgcat usage

If you’re serving up a static website from S3, especially if you have larger assets stored there, you may want to put monitoring on the requests or bytes downloaded from S3, just to make sure someone’s not running up terabytes of transfers or millions of requests.

Enabling Metrics on S3

You will have to enable metrics on S3 in order to get CloudWatch alarms on them.

  • Go to Services -> S3 -> Buckets and select the bucket for your static site.
  • Select Management tab and [Metrics] and then click on the pencil icon next to the bucket icon.
  • Once enabled, the metrics will take a bit to populate.

Setting up Simple Notification Service (SNS)

(These are the same instructions as in Monitoring your CloudFront #omgcat usage)

Set up a topic

  • First, you’re going to want to be notified. Go to Services -> Simple Notification Service to set up a pathway for that to happen.
  • Next, click “Topics” and then [Create topic]
  • Name your topic something that adequately describes the purpose (I just used domainname-com)
  • Scroll down to [Create Topic]

Set up a subcription

  • Under “Amazon SNS” left sidebar, click “Subscriptions” and [Create Subscription]
  • Click on the Topic ARN field and you should be able to see an ARN with your topic name as the last part of the ARN. Click that ARN
  • Under Protocol, select your preferred method of notification (I’m going with SMS.
  • Under Endpoint, enter your cell number, including country code (+18005551212 for (800) 555-1212 in the US)

Setting up a CloudWatch Alarm

  • Go to Services -> CloudWatch -> Alarms and [Create alarm]
  • [Select metric] and select S3
  • If you don’t see “Request Metrics per Filter” then the metrics haven’t started populating yet.
  • Check “GetRequests” or “BytesDownloaded” and [Select Metric]
  • Set conditions as you would like to have flag any anomalies and click [Next]
  • Choose “In Alarm” and “Select an existing SNS topic” and click in the box below “Send Notification To…” to get suggestions and select the SNS topic corresponding to the notification method you set up. Click [Next]
  • Name your alarm and click [Next]
  • Review the summary and click [Create Alarm]

Monitoring your CloudFront #omgcat usage

Ok, so you’ve created a lovely static site and/or set up a CloudFront distribution for https for it. But CloudFront bills by the GB. What if somebody decides your assets are perfect to hotlink to or just straight up makes an insane boatload of requests? How do you protect yourself from getting a frightening bill before AWS Budget can even notify you?

Setting up Simple Notification Service (SNS)

Set up a topic

  • First, you’re going to want to be notified. Go to Services -> Simple Notification Service to set up a pathway for that to happen.
  • Next, click “Topics” and then [Create topic]
  • Name your topic something that adequately describes the purpose (I just used domainname-com)
  • Scroll down to [Create Topic]

Set up a subcription

  • Under “Amazon SNS” left sidebar, click “Subscriptions” and [Create Subscription]
  • Click on the Topic ARN field and you should be able to see an ARN with your topic name as the last part of the ARN. Click that ARN
  • Under Protocol, select your preferred method of notification (I’m going with SMS.
  • Under Endpoint, enter your cell number, including country code (+18005551212 for (800) 555-1212 in the US)

Get Notified

  • Go to Services -> CloudFront -> Alarms
  • [Create Alarm]
  • Under Metric, choose the threshold that you want to detect on… maybe it’s Requests, maybe it’s Bytes Downloaded…
  • Select the distribution that you want to watch ( domainname.com should be mentioned in the dropdown)
  • For “Send a notification to”, select the SNS topic that corresponds to the notification method you set up.
  • Since mine is a dev/test site, I don’t expect more than a request/second so
  • Finally, [Create Alarm]

Testing the Alarm

  • If you have a low enough threshold you can probably just hold down F5 (or whatever your refresh key is) for a few seconds. (Word of caution: Don’t do this with a page that downloads a lot of assets!)
  • In bash you can also do the following.
for i in {1..61}
do
  curl https://domainname.com
done
  • If your notifications are working, you should get an message through your preferred notification method.
  • Under Services -> CloudWatch -> Alarms, you should also see your Alarm count be > 0.