Update: Improving the SSL certificate expiration alerts [breaking change]

We just updated our SSL certificate expiration alerting. This update gives you more control over where and when you want to receive these alerts.

Please note that your current SSL alerting settings under "Alert Settings" (whether account-wide or specific to a check or group) are deprecated and will stop working in three months, on November 13th, 2020.

Find out more in our announcement post

New feature: Maintenance Windows

We just released our new Maintenance Windows feature! You can find it under the 🔧 icon on the side menu.

maintenance-windows-editor.png

Using maintenance windows you can now:

  • Schedule downtime for your checks. You can do this as a "one off" or on a repeating cycle.
  • Target specific checks or check groups based on tags

You can create as many maintenance windows as you like.
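To illustrate how a one-off or repeating window combined with tag targeting might be evaluated, here is a minimal sketch. The `MaintenanceWindow` class, its field names, and the matching rule (a check matches if it shares at least one tag with the window) are assumptions made for illustration, not Checkly's actual data model or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class MaintenanceWindow:
    """Hypothetical model of a maintenance window (not Checkly's schema)."""
    starts_at: datetime
    ends_at: datetime
    repeat_every: Optional[timedelta] = None  # None means a "one off" window
    tags: frozenset = frozenset()

    def is_active(self, now: datetime) -> bool:
        if now < self.starts_at:
            return False
        if self.repeat_every is None:
            return now < self.ends_at
        # Repeating window: find how far we are into the current cycle.
        duration = self.ends_at - self.starts_at
        elapsed_in_cycle = (now - self.starts_at) % self.repeat_every
        return elapsed_in_cycle < duration

    def applies_to(self, check_tags, now: datetime) -> bool:
        # Assumption: a check is targeted when it shares at least one tag.
        return self.is_active(now) and bool(self.tags & set(check_tags))


start = datetime(2020, 8, 1, 2, 0, tzinfo=timezone.utc)
weekly = MaintenanceWindow(
    starts_at=start,
    ends_at=start + timedelta(hours=1),
    repeat_every=timedelta(days=7),
    tags=frozenset({"production"}),
)
one_off = MaintenanceWindow(
    starts_at=start,
    ends_at=start + timedelta(hours=1),
    tags=frozenset({"production"}),
)
```

In this sketch the weekly window covers the same hour every seven days, while the one-off window only covers the first occurrence.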

Read more in our docs

Update - GitHub API intermittent issues

We rolled out a fix for the intermittent errors we were seeing on the GitHub API. We've been monitoring this and see no more errors. This means GitHub-triggered check runs should no longer end up in a "hanging state".

Intermittent issues on GitHub triggered check runs

We are seeing intermittent 404 responses from GitHub on our CI/CD triggers. This can cause your GitHub-triggered check runs to fail because we cannot access the relevant deployment data on your GitHub PR.

Sadly, this also happens on re-runs. We are looking into a fix to mitigate this.

Dashboard interface improvements

We just released a few tiny changes to our Dashboard that we think will improve your experience.

Easy toggles to Mute & Deactivate checks

You can now easily mute or deactivate any check right from the dashboard.

2020-07-06 16.31.05.gif

Your check status at a glance

Check pages now include handy flags to see which regions the check is running from, as well as status banners that inform you about your check status at a glance.

Screen Shot 2020-07-06 at 16.44.45.png

Opsgenie integration

We just added Opsgenie to our alerting integrations! 🎉

Now Checkly can create and resolve alerts in your Opsgenie team and integrate into your on-call workflows.

opsgenie.png

We are excited to enable you to use this integration today. In the meantime, Checkly is working together with Opsgenie to expand the integration, so you can expect additional features soon.

Opsgenie is available to all plans above the Developer plan. Learn how to integrate Checkly with your Opsgenie team in our docs 👈

[retroactive] short API outage on Friday

On Friday, from 2020-06-12 10:38 UTC to 2020-06-12 10:44 UTC, the Checkly API was down for 6 minutes. Functionality such as adding, editing and removing checks was affected. None of the monitoring services were affected. Since our Dashboard uses the same API, the Dashboard was affected as well.

The issue was caused by our API instance going into a crash loop when an exception occurred due to an error in our error handler. Ironic, right?
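As a generic illustration of this failure mode (not Checkly's actual code), the sketch below shows an error handler that itself raises, and a guard that keeps that second exception from escaping. The names `report_error` and `handle_request` are hypothetical.

```python
import logging

logger = logging.getLogger("api")


def report_error(exc):
    """Hypothetical error handler that contains a bug of its own."""
    # Bug: most Python exceptions have no `.message` attribute,
    # so this line raises AttributeError while handling the first error.
    return {"message": exc.message}


def handle_request(handler):
    """Run a request handler and return (status, body)."""
    try:
        return 200, handler()
    except Exception as exc:
        try:
            return 500, report_error(exc)
        except Exception:
            # The error handler failed too. Log it and fall back to a
            # bare response instead of letting the process crash.
            logger.exception("error handler failed")
            return 500, {"message": "internal error"}


def failing_handler():
    raise ValueError("bad input")
```

Without the inner `try`, the `AttributeError` raised inside `report_error` would propagate out of `handle_request`, which is the kind of path that can take a process down and, under a restart policy, produce a crash loop.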

Bonus fact: we were alerted to this outage by Checkly itself. This wasn't the first time this has happened, but it never gets old for us here at Checkly.

Zrzut ekranu 2020-06-12 o 13.04.55.jpg Time in the screenshot is in CET.

Strict null assertions in API checks

We just added two new assertion types that you can use with JSON body API checks.

API_check___assertions.png

  1. Is null will assert that value is strictly equal to null
  2. Not null will assert that value is strictly not equal to null

You can use the new assertions combined with JSON path expressions to add better validation to your API check response.
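A minimal sketch of what "strictly equal to null" means in practice: only a JSON `null` passes, while `0`, `""` and `false` do not. The helper names, the simplified dotted path syntax (not full JSON path), and the choice to fail both assertions when a property is missing are assumptions for illustration, not a description of how Checkly evaluates assertions.

```python
import json

_MISSING = object()  # sentinel: the path did not resolve to any value


def resolve(body, path):
    """Follow a simplified dotted path such as 'user.deleted_at'."""
    current = body
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def is_null(body, path):
    # JSON null is parsed to None; identity comparison keeps this strict.
    return resolve(body, path) is None


def not_null(body, path):
    # Assumption: a missing property does not count as "not null".
    value = resolve(body, path)
    return value is not _MISSING and value is not None


body = json.loads('{"user": {"deleted_at": null, "visits": 0, "name": ""}}')
```

Here `is_null(body, "user.deleted_at")` holds, while `user.visits` and `user.name` satisfy `not_null` even though they are "falsy" values.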

Cape Town, South Africa and Milan, Italy regions now available

We just added the Cape Town, South Africa 🇿🇦 and Milan, Italy regions. This is pretty epic, especially for the underserved African region. Both regions are enabled for all plans and available now.

regions.png

[retroactive] scheduling outage for browser checks

On Monday, 18 May, we had an outage in processing browser check results between 15:44 UTC and 20:38 UTC. This was caused by a bug in our release and deployment software. API checks were not impacted.

This outage had the following consequences:

  • No browser results were stored in our database from that period
  • You will not find browser check results in your dashboard from that period
  • No alerts were triggered for failing browser checks, as these rely on the results being processed.

We published a full post mortem on the outage detailing the root cause and most importantly our actions to prevent this in the future. In a nutshell:

  • Our own monitoring and alerting failed here, causing the outage to last much longer than needed.
  • The bug itself was minor and easily and quickly rectified.
  • We are putting three distinct measures in place to stop this from happening again.

On a more personal note: it is bitter that this outage was effectively caused by the engineering team's own work on reliability and on better testing and release procedures. The code changes this work requires sometimes have bugs, like all code.

Tim, CTO & co-founder

Find the detailed post mortem here: https://blog.checklyhq.com/post-mortem-outage-browser-check-results-alerting/