Check statistics & metrics got an update

We just released an update that adds some polish and much-wanted new options to the Check stats & metrics page.

check_stats.png

  • You can now use the date search box to select any date range you wish. Date ranges starting more than 30 days ago have a resolution of 1 hour.

  • You can use the forward and backward buttons to hop an hour in each direction.

  • The graph now shows clear markers on each failed check. Clicking the marker takes you to the failed check's detail page.

  • All uptime metrics are reported with Five Nines notation now.
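As a rough illustration of the Five Nines bullet above, here is a minimal sketch of how an uptime figure could be derived from check counts. It assumes "Five Nines notation" means a percentage with three decimals, such as 99.999%. The function name and the truncate-instead-of-round choice are our own assumptions for this example, not a description of how the page computes its numbers.

```python
from decimal import Decimal, ROUND_DOWN


def uptime_percentage(total_checks: int, failed_checks: int) -> str:
    """Format uptime as a percentage with three decimals (hypothetical helper).

    The value is truncated rather than rounded, so an uptime below 100%
    is never displayed as 100.000%.
    """
    if total_checks <= 0:
        raise ValueError("total_checks must be positive")
    if not 0 <= failed_checks <= total_checks:
        raise ValueError("failed_checks must be between 0 and total_checks")

    successful = Decimal(total_checks - failed_checks)
    ratio = successful / Decimal(total_checks) * 100
    return f"{ratio.quantize(Decimal('0.001'), rounding=ROUND_DOWN)}%"


if __name__ == "__main__":
    # One failed check out of 100,000 runs.
    print(uptime_percentage(100_000, 1))
```

Truncating keeps a single failure visible even on very large check counts, which seems in the spirit of reporting this many decimals.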

[resolved] Checkly API outage

We experienced some issues with our API due to a maintenance and upgrade process.

The Checkly web application and dashboards were showing errors and might not have been available.

This outage lasted from 13:08 to 13:11 CET.

New Alert History & navigation changes

You can now find all things related to day-to-day monitoring and check management on the new sidebar menu on the left.

Dashboard management has also moved from the homepage to its own dedicated menu item.

All things related to your account (billing, plans, teams) are still in the "old" menu at the top right.

Furthermore, we released the new Alerts History overview: a timeline with all your alerts across all checks.

Check___alerts.png

Bugfixes, tweaks and a new graph.

Over the last two weeks we released some iterative updates:

  • We added a nice visualization of request timing to all API check results, inspired by the Chrome Developer Tools.

API_check___request.png

  • We pushed an important bugfix: timed-out Browser checks would in some cases not report the timeout correctly.

  • Your in-app dashboard is now updated directly via websockets. New checks are run instantly and their results show up right away; the earlier delay caused some confusion for new users. Performance for busy dashboards also improves, as we now only update state when needed instead of in bulk for all checks. We will introduce websockets to the public Dashboards too.

  • Your dashboard's state is now bookmarkable and linkable, as we added the necessary filtering, tagging and pagination options to the query parameters.

  • TV-mode dashboards are now just called "Dashboards". Why complicate things?

Dashboard.png
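To make the bookmarkable dashboard state mentioned above a bit more concrete, here is a minimal sketch of how filters, tags and pagination can round-trip through query parameters. The parameter names (`tag`, `status`, `page`) and the example URL are hypothetical and chosen for illustration only; the real dashboard may use different names.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def dashboard_url(base, *, tags=(), status=None, page=1):
    """Encode dashboard state into a shareable URL (hypothetical parameter names)."""
    params = [("tag", tag) for tag in tags]  # repeated key, one entry per tag
    if status:
        params.append(("status", status))
    params.append(("page", str(page)))
    return f"{base}?{urlencode(params)}"


def parse_dashboard_url(url):
    """Recover the dashboard state from a bookmarked URL."""
    query = parse_qs(urlsplit(url).query)
    return {
        "tags": query.get("tag", []),
        "status": query.get("status", [None])[0],
        "page": int(query.get("page", ["1"])[0]),
    }


if __name__ == "__main__":
    url = dashboard_url(
        "https://example.com/dashboard", tags=["api", "prod"], status="failing", page=2
    )
    print(url)
    print(parse_dashboard_url(url))
```

Because the whole state lives in the URL, a bookmark or a link pasted into chat restores the same view without any server-side session.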

Puppeteer Recorder now records screenshots

We just released v0.7.0 of Puppeteer Recorder, our handy Chrome extension that makes recording Puppeteer scripts a breeze.

This new version adds the option to take screenshots, either of the current page or of a clipped portion of the page.

Just right-click or use the Cmd+Shift+A shortcut!

context_menu.png

Read more about Puppeteer Recorder and how to use it in our docs.

Ongoing: Slack alert delivery errors

Due to a system-wide outage at Slack, some Slack alerts are not being delivered. We are seeing intermittent timeouts and errors on our backend.

Regrettably, there is not much we can do. Please follow https://status.slack.com/ for any updates on this issue.

Slack and Email alerts are now more actionable

Based on customer feedback, we added a ton of extra data points to our Slack and email alerts, making them way more actionable!

Slack_alert.png

For Slack, the message is enriched with:

  • Status code, response time & datacenter location.
  • A block detailing the assertions.
  • The response body if any.
  • A request error in case the request could not be fired at all (think DNS or other connectivity issues).

Email alerts now also show all assertions and, where applicable, request errors!
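For readers wiring up their own tooling, here is a rough sketch of how the data points listed above could be assembled into a Slack Block Kit message. This is not our actual alert format: the function name, argument names and wording are hypothetical, and only the generic `section` block structure with `mrkdwn` text objects is taken from Slack's Block Kit.

```python
def _section(text):
    """Build a Block Kit section block containing one mrkdwn text object."""
    return {"type": "section", "text": {"type": "mrkdwn", "text": text}}


def build_slack_alert(check_name, status_code, response_time_ms, location,
                      assertions, response_body=None, request_error=None):
    """Assemble an alert payload; `assertions` is a list of (description, passed) pairs."""
    blocks = [
        _section(f"*{check_name}* has failed"),
        {
            "type": "section",
            "fields": [
                {"type": "mrkdwn", "text": f"*Status code:* {status_code}"},
                {"type": "mrkdwn", "text": f"*Response time:* {response_time_ms} ms"},
                {"type": "mrkdwn", "text": f"*Location:* {location}"},
            ],
        },
    ]
    if assertions:
        lines = "\n".join(
            f"{'PASS' if passed else 'FAIL'}  {description}"
            for description, passed in assertions
        )
        blocks.append(_section(f"*Assertions*\n{lines}"))
    if response_body:  # only included when the response had a body
        blocks.append(_section(f"*Response body*\n{response_body}"))
    if request_error:  # e.g. DNS or other connectivity issues
        blocks.append(_section(f"*Request error*\n{request_error}"))
    return {"blocks": blocks}


if __name__ == "__main__":
    import json

    payload = build_slack_alert(
        "Homepage API", 500, 812, "eu-west-1",
        [("status code equals 200", False)],
        response_body='{"error": "boom"}',
    )
    print(json.dumps(payload, indent=2))
```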

Retroactive: 10 minute outage / delay

We just experienced a roughly 10-minute outage from 11:56 to 12:05 CET. The web application was slow or not available and alerts were delayed by up to 5 minutes.

The cause of this outage was a routine maintenance update to our database colliding with a long-running backup job.

The maintenance concerned the alteration of an unused foreign key constraint on a table. Regrettably, the read locks set by the backup job block such an alteration, causing the alter query to hang on a table serving our check results.

Once we realized this, we killed the query and service was restored. From now on, such maintenance will not be performed during our daily backup window.

JSON path, regexes, GraphQL and more

We just released some nice new features!

  • Assertions now have JSON path support. Easily drill down into JSON response bodies and grab the data you need. Examples in our docs.

  • We added Regular Expressions to drill down into regular response bodies and headers. More examples

  • There is now a Contains matcher to loosely match targets.

  • GraphQL queries are now first class API request citizens.

  • We show you a full request log for API checks, including setup & teardown scripts logs for much easier debugging.

  • Make your code prettier by using the Prettier button in all code editors.
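To give a feel for what the three new assertion styles above do, here is a small standalone sketch. It is not how assertions are implemented on our side: `json_path` is a hypothetical helper that only understands a tiny subset of JSON path (dotted keys and numeric indexes), and the sample body and header are made up.

```python
import json
import re


def json_path(document, path):
    """Resolve a tiny JSON path subset such as '$.data.users[0].name'."""
    current = document
    # Each match is either a '.key' step or an '[index]' step.
    for key, index in re.findall(r"\.([A-Za-z_][\w-]*)|\[(\d+)\]", path):
        current = current[key] if key else current[int(index)]
    return current


def regex_capture(text, pattern):
    """Return the first capture group of the first match, or None if nothing matches."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


def contains(target, expected):
    """Loose match: true when `expected` occurs anywhere in `target`."""
    return expected in target


if __name__ == "__main__":
    body = json.loads('{"data": {"users": [{"name": "ada", "id": 7}]}}')
    cache_header = "public, max-age=3600"

    print(json_path(body, "$.data.users[0].name"))
    print(regex_capture(cache_header, r"max-age=(\d+)"))
    print(contains(cache_header, "public"))
```

JSON path fits structured bodies, regular expressions fit free-form text and headers, and a contains match is the loosest option when an exact value is not known up front.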

Ongoing: Puppeteer 1.15 stability and rollback

We noticed elevated error rates on EU region browser check runners after yesterday's upgrade to Puppeteer 1.15. All other regions do not seem to be affected.

To investigate this, we rolled back the Frankfurt, Ireland, London and Paris regions to the previous release, version 1.11. We will keep monitoring. This should not impact any running browser checks.