Tagged: cloud

  • ThomasPowell 8:26 am on October 25, 2021
    Tags: cloud, datadog, statuspage, statuspage.io

    How to Automate Uptime Display in StatusPage.io via Synthetic Monitors in Datadog in 4 steps 

    I was given the task of replacing a 1/0 metric on our Statuspage.io page, fed from Datadog Metrics by a Route53 health check, with an actual percentage uptime display in StatusPage.io, or at least something similarly meaningful to the end user.

    First stop: Service Level Objectives

    When browsing around our current monitors and dashboards, one thing that stood out was “service level objectives.” In combination with synthetics, they provide an uptime percentage over a period of time that can be embedded on a dashboard. (We’ll come back to synthetics later, with a different approach.)

    SLO Synthetics Uptime Display in Datadog

    Next stop: Trying to embed those SLOs

    The System Metrics integration on the statuspage.io side seems to be built only for flat, point-in-time queries, not for values aggregated over days or weeks. An aws.route53.health_check_status query that produced either a 1 or a 0 at any given point in time was fine, but coming up with a way to “query” for a 24-hour or 90-day uptime was a different story (impossible to do via direct integration between the two apps?).
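    To make the gap concrete, here is a minimal sketch of the aggregation the point-in-time metric cannot express on its own: turning a series of 1/0 health-check samples into an uptime percentage over a window. The sample data and the function name are made up for illustration; this is not how Datadog or Statuspage computes anything.

```python
from typing import Sequence


def uptime_percentage(samples: Sequence[int]) -> float:
    """Return the share of samples equal to 1, as a percentage."""
    if not samples:
        raise ValueError("need at least one sample")
    up = sum(1 for s in samples if s == 1)
    return 100.0 * up / len(samples)


# Hypothetical data: one health-check sample per minute for 24 hours,
# with a 30-minute outage in the middle.
samples = [1] * 700 + [0] * 30 + [1] * 710
print(f"{uptime_percentage(samples):.2f}%")
```

    Each individual sample is still just a 1 or a 0; the percentage only appears once something averages over the whole window, which is the part the direct integration did not seem to offer.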

    Third stop: UptimeRobot and Similar

    Jyll over @ Veracity.net suggested experimenting with Uptime Robot and similar services against my own free instance of StatusPage. Stripping away the extra configuration and feeding a simple up/down email or webhook to statuspage.io brought me back to the idea of checking whether I could email or webhook synthetic alerts from Datadog to Statuspage. (Spoiler: You can!)

    Final stop (and the actual steps needed!): Automating Datadog to Send Status to get Uptime Display in StatusPage.io

    1. Add a component in your statuspage.io account
    2. Click on the “Automation” button to get the automation email. Copy that email:
    Click the Automation button to reveal your automation email

    3. Create a synthetic monitor that checks a heartbeat route, if you don’t already have one.

    4. Go to your synthetic monitor in Datadog. Step 6 there is “Notify your team”. Your monitor name needs to use the template variables {{#is_alert}}DOWN{{/is_alert}}{{#is_recovery}}UP{{/is_recovery}} for statuspage automation to understand the message. The rest of the monitor name is irrelevant (as long as DOWN or UP isn’t a fixed part of that name!).

    The automation email needs to be mentioned in the message body with an @ in front of it.

    Monitor alert settings
    No, that’s not a valid automation email.
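    As a rough sketch of step 4, the snippet below assembles a monitor name and message body and shows what the name resolves to in each state. The base name, the automation address, and the helper functions are all hypothetical, and the small regex is only a plain-Python stand-in for how the conditional blocks resolve, not Datadog’s template engine. The whole-word, case-insensitive check for UP/DOWN is a conservative assumption about what could confuse Statuspage.

```python
import re

# Conditional blocks from step 4: only one of them renders per notification.
TEMPLATE_SUFFIX = "{{#is_alert}}DOWN{{/is_alert}}{{#is_recovery}}UP{{/is_recovery}}"

# Hypothetical placeholder; use the address revealed by the Automation button.
AUTOMATION_EMAIL = "component+abc123@notifications.example.com"


def build_monitor_name(base: str) -> str:
    """Append the UP/DOWN template to a base name that has neither word fixed."""
    if re.search(r"\b(UP|DOWN)\b", base, flags=re.IGNORECASE):
        raise ValueError("base name must not contain UP or DOWN as a fixed word")
    return f"{base} {TEMPLATE_SUFFIX}"


def build_message(body: str, email: str = AUTOMATION_EMAIL) -> str:
    """Mention the automation email in the message body with an @ in front."""
    return f"{body} @{email}"


def rendered_name(name: str, state: str) -> str:
    """Stand-in renderer: keep the block matching `state`, drop the others."""
    def keep(match: re.Match) -> str:
        return match.group(2) if match.group(1) == state else ""
    return re.sub(r"\{\{#is_(\w+)\}\}(.*?)\{\{/is_\1\}\}", keep, name)


name = build_monitor_name("Heartbeat check")
print(rendered_name(name, "alert"))
print(rendered_name(name, "recovery"))
print(build_message("Heartbeat route did not respond."))
```

    The point of the guard in build_monitor_name is the caveat above: if UP or DOWN were a fixed part of the name, both states would carry the same keyword.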

  • ThomasPowell 6:23 am on May 26, 2021
    Tags: cloud, free

    Inauspicious Start for Oracle Cloud Free Tier Sign up 

    I heard via word of mouth and Twitter of a new Oracle Cloud Free Tier (with permanently free services [for now]). The always free services looked enticing enough:

    Oracle Cloud Free Tier offerings
    AMD and ARM and object storage!

    The challenge was, “Who can afford ‘free’ services?” Time is worth something. But I can always make use of another cloud server to run experiments on.

    Problem #1: Email confirmation didn’t go through

    Self-explanatory, but, yes… I checked my spam and all the auto-sorting tabs. The email with the confirmation link, which is only good for 30 minutes, didn’t arrive in time. On the second attempt, the email showed up immediately.

    Problem #2: Password Too Strong

    My first 30-character randomly generated password didn’t pass the test:

    I think I met the requirements??

    Problem #3: Wouldn’t validate my debit card

    Maybe there’s a payment glitch right now? Maybe I don’t have enough in the account for a “free” account? Worse… the “try again” link makes you start over from the very first step of creating your account.

    Problem #4: Declined my credit card

    After going through 2-3 times with a debit card, I tried with a credit card. Maybe I needed five figures of available credit for a free account? This is Oracle, after all.

    Upon resubmitting, I’m back to “Error processing transaction”

    Aha! Moment

    Item (b) in the above error message was the clue that eventually led me to the right answer: my VPN (still US-based) was active, which possibly set off alarm bells with the payment processor. I’m in now and ready to try some VMs!

  • tech0x20 9:20 am on February 25, 2009
    Tags: cloud

    If 5 nines is a myth, what is 3 nines? 

    “Burned By Gmail Outage? Google Will (Almost) Buy You a Postage Stamp.” Apparently the SLA for Google Apps will get you 3 days of service credit if uptime dips below 99.9% for the month. At $50 annually for the service, that’s $0.41. (Google has decided to pay out 15 days of credit anyway.)

    Service credits from the SLA:

    Monthly Uptime Percentage    Days of Service added to the end of the Service term, at no charge to Customer
    < 99.9% – ≥ 99.0%            3
    < 99.0% – ≥ 95.0%            7
    < 95.0%                      15

    That’s a $0.41, $0.96, and $2.05 credit for more than 43.2 minutes, 7.2 hours, and 36 hours of downtime, respectively (based on a 30-day month). I realize that across an entire business, that could potentially be a thousand credits or more, but what business would see a $2.05 per user credit as adequate for 36 hours of downtime in a month?
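    The arithmetic behind those figures can be sketched as follows. The $50 annual price and the tier table come from the post; the 365-day year and 30-day month are assumptions made here, and the function names are made up for illustration.

```python
ANNUAL_PRICE_USD = 50.0     # per-user price quoted in the post
DAYS_PER_YEAR = 365         # assumption: non-leap year
HOURS_PER_MONTH = 30 * 24   # assumption: 30-day month

# (monthly uptime threshold in percent, days of service credited below it)
SLA_TIERS = [(99.9, 3), (99.0, 7), (95.0, 15)]


def credit_value(days: int) -> float:
    """Dollar value of a number of free service days at the annual price."""
    return ANNUAL_PRICE_USD * days / DAYS_PER_YEAR


def downtime_hours(uptime_pct: float) -> float:
    """Hours of downtime in a month that correspond to a given uptime."""
    return (100.0 - uptime_pct) / 100.0 * HOURS_PER_MONTH


for threshold, days in SLA_TIERS:
    print(
        f"below {threshold}%: more than {downtime_hours(threshold):.2f} h down, "
        f"credit ${credit_value(days):.2f}"
    )
```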

    From these two articles in The Register, “Google’s email service goes down” and “Google blames Gmail outage on data centre collapse,” it looks like the downtime was about 2 hours and 15 minutes, but there are some reports of outages as long as 4 hours.

    There is a nice article on High Availability on Wikipedia to compare uptimes on a weekly, monthly, and yearly basis.
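    For a quick version of that comparison, here is a small sketch that converts an availability percentage into allowed downtime per week, month, and year. The period lengths (7-day week, 30-day month, 365-day year) are assumptions chosen here, so the figures may differ slightly from tables that use other conventions.

```python
# Assumed period lengths in hours.
PERIOD_HOURS = {"week": 7 * 24, "month": 30 * 24, "year": 365 * 24}


def allowed_downtime_minutes(availability_pct: float, period: str) -> float:
    """Minutes of downtime permitted in a period at a given availability."""
    return (100.0 - availability_pct) / 100.0 * PERIOD_HOURS[period] * 60


# Three nines versus five nines, as in the title of this post.
for pct in (99.9, 99.999):
    row = ", ".join(
        f"{period}: {allowed_downtime_minutes(pct, period):.2f} min"
        for period in PERIOD_HOURS
    )
    print(f"{pct}% -> {row}")
```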
