Creating Responsive Escalations for Downtime Alerting

Escalations notify users when a check has stayed down longer than a designated period of time. An escalation applies only to downtime that occurs after the escalation was created. The role of escalations is to issue additional alerts as the severity of a downtime event increases, so that the person on your team best equipped to respond quickly can act.

To skip to a specific section of the "Contacts, Integrations, and Alert Escalations" video, use the "Skip to" links in each section.

Escalations should send a new alert to the same team at intervals you define, raising the sense of urgency first with email, then with SMS, then with phone calls. In short, escalating an incident conveys urgency and gives those with the highest permission levels the data they need to act quickly after known solutions have been explored.

This guide covers best practices for using escalations to build a more thorough alert system.

Responsive Escalations At A Glance

  • Create Escalations for individual contacts.

  • Create integrations with a number of providers for seamless notifications.

  • Set On-Call Hours to determine when a contact should be notified.

Creating Smarter Escalations

Skip to 3:32 in the video


Smarter escalations route alerts to different destinations based on urgency:

  • Create a Contact for each “tier” of maintenance or response (e.g., first response, higher permissions, admin).
  • Establish multiple methods of communication for a single person or team (Contact) that trigger as the outage extends. For example, create a Contact for Tier I SMS, then Tier I email to escalate after 15 minutes, and then Tier I voice call once the 30-minute mark has passed.
  • Integrate with your existing services. We provide push notifications, including metrics and alerts, to 18 providers, including customized webhooks for your internal configuration.
  • Create multiple checks to reduce false positives, confirm outages faster, and perform root cause analysis.

Escalations 3.png

An example of smart escalations in practice. The Admin contact is notified of extended downtime after 2 hours, but Tiers I and II are notified several times before then.
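
The tiered approach described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how escalations are implemented or configured in the product: the `Escalation` class, the `POLICY` list, and the `contacts_due` function are hypothetical names, and the assumption that the Tier I SMS fires at 0 minutes is ours. The 15-minute, 30-minute, and 2-hour thresholds come from the examples in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Escalation:
    contact: str        # hypothetical contact name
    after_minutes: int  # downtime required before this contact is alerted


# Hypothetical policy modeled on the Tier I and Admin examples in this article.
POLICY = [
    Escalation("Tier I SMS", 0),          # assumed: alerted as soon as downtime starts
    Escalation("Tier I email", 15),
    Escalation("Tier I voice call", 30),
    Escalation("Admin", 120),
]


def contacts_due(policy, downtime_minutes):
    """Return the contacts whose threshold has been reached, in escalation order."""
    due = [e for e in policy if downtime_minutes >= e.after_minutes]
    return [e.contact for e in sorted(due, key=lambda e: e.after_minutes)]


print(contacts_due(POLICY, 20))  # ['Tier I SMS', 'Tier I email']
```

After 20 minutes of downtime, only the first two Tier I methods have triggered. The Admin contact is not reached until the outage has lasted 2 hours.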

The keys to smart escalation are a robust monitoring system and designated contacts for specific levels of response, each with an escalating sense of urgency.

Please note: It is important to disable any Do Not Call functionality on your mobile device when using the SMS or Phone Alert system, to ensure that time-sensitive escalations are not missed.

Repeating Escalations

Skip to 3:47 in the video


When setting up an escalation, you can also set the number of times the escalation should repeat.

The retry interval matches the outage duration configured for the contact being notified.

Repeat escalations can also be configured for subsequent layers of your escalation policy.

Please note: The maximum number of Repeat Times is 50.
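
To illustrate how a repeating escalation plays out over time, here is a minimal Python sketch. It assumes the first alert fires once the configured outage duration has elapsed and each repeat fires one further interval later. The function name `repeat_schedule` and its parameters are hypothetical; only the limit of 50 repeats and the rule that the retry interval matches the outage duration come from this article.

```python
MAX_REPEAT_TIMES = 50  # limit stated in this article


def repeat_schedule(outage_minutes, repeat_times):
    """Minutes of downtime at which an escalation and its repeats would fire.

    Assumes the first alert fires at outage_minutes and every repeat fires
    one more interval of the same length afterwards.
    """
    if outage_minutes <= 0:
        raise ValueError("outage_minutes must be positive")
    if not 0 <= repeat_times <= MAX_REPEAT_TIMES:
        raise ValueError(f"repeat_times must be between 0 and {MAX_REPEAT_TIMES}")
    return [outage_minutes * n for n in range(1, repeat_times + 2)]


print(repeat_schedule(15, 3))  # [15, 30, 45, 60]
```

Under these assumptions, an escalation set to a 15-minute outage duration with 3 repeats would alert the contact at 15, 30, 45, and 60 minutes of downtime.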

Repeating Escalations.png

On-Call Hours


On-Call Hours send an alert to a contact only while that contact is on call in the designated timezone. If downtime occurs outside of those on-call hours, the contact will receive an “Up” notification when on-call hours resume the next day. It’s best to ensure a contact is designated as Always on Call (the default), or that schedules overlap, so someone always receives a downtime alert in real time when response time matters.
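
The sketch below shows one way to reason about on-call windows and coverage gaps. It is an illustration only: `is_on_call` and `anyone_on_call` are hypothetical helpers, and the fixed UTC offset stands in for a contact's designated timezone.

```python
from datetime import datetime, time, timedelta, timezone


def is_on_call(now, tz, start, end):
    """True if the aware datetime `now` falls within [start, end) in timezone `tz`."""
    local = now.astimezone(tz).time()
    if start <= end:
        return start <= local < end
    # Window crosses midnight, e.g. 22:00 to 06:00.
    return local >= start or local < end


def anyone_on_call(now, schedules):
    """True if at least one (tz, start, end) schedule covers `now`."""
    return any(is_on_call(now, tz, start, end) for tz, start, end in schedules)


now = datetime(2024, 1, 15, 14, 30, tzinfo=timezone.utc)
eastern = timezone(timedelta(hours=-5))  # fixed offset, for illustration only

print(is_on_call(now, eastern, time(9), time(17)))  # True: 09:30 local time
print(is_on_call(now, eastern, time(22), time(6)))  # False: outside the night shift
print(anyone_on_call(now, [(eastern, time(9), time(17)),
                           (eastern, time(22), time(6))]))  # True
```

A check like `anyone_on_call` is a quick way to spot hours where no schedule applies. For real schedules, a named zone such as `zoneinfo.ZoneInfo("America/New_York")` can be passed as `tz` so that daylight saving time is handled.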

Improve Alert Response Rate


Smart escalations:

  1. Reduce response time
  2. Ensure recipients get data at a time and in a place where they can act on it
  3. Take into account on-call hours of staff and personnel so alerts are not lost when someone is off the clock

After you have defined tiers of response, you need to specify where each escalation is sent. A check may have more than one escalation assigned to it.

You can escalate a severe outage to a second external destination, such as Slack or your internal dashboard, in case the first outage report delivered via email is missed.

Want to see our checks in action? Check out our YouTube Library for more!
