Uptime.com monitors your infrastructure for failure, notifying you the moment something goes down. Sometimes a failed check does not signify a real problem, and you need to refine the conditions under which an alert is issued. DNS checks, for example, query multiple servers, any of which may go down at a given moment. Systems like DNS are designed with multiple layers that safeguard against such isolated failures.
When a check fails, IT personnel can waste effort researching false positives that do not signify a real problem. Uptime.com’s Advanced Check Options provide several ways to reduce the likelihood of receiving alerts for failed checks that aren’t problematic, and to better control the conditions under which a check functions.
Table of Contents
- Sensitivity and Timeout
- IP Version
- Maintenance Windows
- Ignoring Alerts
- Tips and Ideas
Setting Check Conditions
Once you have created a check, there are a number of options under Advanced, Escalation and Maintenance that control when and how a check alert is issued. Let’s begin with the Advanced tab, which is used to set your Sensitivity and IP Version.
Sensitivity and Timeout
Sensitivity is the number of Locations that must fail before a check is considered failed. We recommend that every user monitor from at least three Locations and set the Sensitivity to a value that matches or accounts for the majority of those Locations. This gives the best balance between alert speed and accuracy while avoiding the false positives that a low Sensitivity value could create.
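As a rough illustration of how Sensitivity might be evaluated, the Python sketch below counts failing Locations and compares the count with the Sensitivity value. The function names and Location labels are hypothetical and are not part of Uptime.com. The sketch also assumes that a check counts as failed once the number of failing Locations reaches the Sensitivity value.

```python
from typing import Mapping


def majority_sensitivity(location_count: int) -> int:
    """Smallest number of Locations that forms a majority (hypothetical helper)."""
    if location_count < 1:
        raise ValueError("at least one Location is required")
    return location_count // 2 + 1


def check_is_failed(location_ok: Mapping[str, bool], sensitivity: int) -> bool:
    """Assumption: the check fails once `sensitivity` or more Locations report a failure."""
    failed_locations = sum(1 for ok in location_ok.values() if not ok)
    return failed_locations >= sensitivity


# Three Locations, so a majority is two.
results = {"US-East": False, "GBR": True, "AUS": True}
sensitivity = majority_sensitivity(len(results))
print(sensitivity, check_is_failed(results, sensitivity))
```

With three Locations and a Sensitivity of 2, a single failing Location does not mark the check as failed, which is the kind of false positive the recommendation above is meant to avoid.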
Users can also designate a Timeout, measured in seconds, to further control when and how alerts are issued. A timeout error typically signifies a problem with the connection to a specific function of a website. There may be too many users attempting to access a single resource, or a localized outage affecting connection time. The Timeout (available only for HTTPS, API and Transaction checks) provides a first indication, with technical data, that there is a problem connecting to your website that requires investigation.
Configuring the Number of Retry Attempts
The number of retry attempts determines how many times a check is re-run before a location is considered down. The default setting is 2, but Uptime.com allows users to choose from 1 to 3 attempts. We recommend using 2 retries for fast alerting that avoids false positives.
Please note: The retry interval for API and Transaction checks is two minutes, as opposed to one minute for other checks.
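The retry behavior can be sketched as a short loop. This is only an illustration under stated assumptions: `location_is_down` and `run_check` are hypothetical names, the sketch assumes one initial attempt followed by the configured number of re-runs, and it leaves out the one- or two-minute wait between attempts.

```python
from typing import Callable


def location_is_down(run_check: Callable[[], bool], retries: int = 2) -> bool:
    """Return True only if the initial attempt and every retry fail.

    `run_check` is a hypothetical callable that returns True when the check passes.
    """
    if not 1 <= retries <= 3:
        raise ValueError("retries must be between 1 and 3")
    if run_check():
        return False  # first attempt succeeded, nothing to retry
    for _ in range(retries):
        # A real monitor would wait for the retry interval here.
        if run_check():
            return False  # recovered on a retry, so the location is not down
    return True


# Simulated outcomes: the check fails twice, then recovers on the second retry.
outcomes = iter([False, False, True])
print(location_is_down(lambda: next(outcomes), retries=2))
```

In this sketch a brief failure that clears within the retries never marks the location as down, while a lower retry count reports the same failure sooner.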
Setting retry attempts and sensitivity can be done in bulk.
IP Version
All checks, except for API and Transaction checks, default to IPv4 for connections unless IPv6 is specified. IPv6 is gaining in popularity as more routers and consumer devices use the addressing scheme. In specific instances, such as monitoring uptime to an interconnected device or a specific use of an API, it’s important to use an IPv6 address. To use any available address, leave this option on Any.
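One way to picture the IP Version option is as a filter over the addresses a hostname resolves to. The sketch below uses Python's standard `ipaddress` module. The `select_addresses` function and its option strings are hypothetical, and the sample addresses come from ranges reserved for documentation.

```python
import ipaddress
from typing import List


def select_addresses(addresses: List[str], ip_version: str = "Any") -> List[str]:
    """Keep only the addresses that match the chosen IP version (hypothetical helper)."""
    parsed = [ipaddress.ip_address(address) for address in addresses]
    if ip_version == "Any":
        return [str(address) for address in parsed]
    wanted = {"IPv4": 4, "IPv6": 6}[ip_version]
    return [str(address) for address in parsed if address.version == wanted]


resolved = ["192.0.2.10", "2001:db8::1"]
print(select_addresses(resolved, "IPv6"))
```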
Escalations
Escalations notify users when a check has stayed down longer than a designated period of time (from one minute up to several days). Choose Escalations from the check screen to designate how much time must pass and who will be notified. Any check can be escalated, with a wide range of options for how the escalation works.
See our article on creating smart escalations to see this in action.
Escalations can be configured in bulk.
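The escalation idea described above can be sketched as a list of rules, each pairing a waiting period with the contacts to notify. The `EscalationRule` class, the `contacts_to_escalate` function, and the contact names are hypothetical and do not reflect how Uptime.com stores escalations.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Tuple


@dataclass(frozen=True)
class EscalationRule:
    """Hypothetical rule: notify `contacts` once a check has been down longer than `wait`."""
    wait: timedelta
    contacts: Tuple[str, ...]


def contacts_to_escalate(down_for: timedelta, rules: List[EscalationRule]) -> List[str]:
    """Return the contacts whose waiting period has been exceeded, shortest wait first."""
    notified: List[str] = []
    for rule in sorted(rules, key=lambda r: r.wait):
        if down_for > rule.wait:
            for contact in rule.contacts:
                if contact not in notified:
                    notified.append(contact)
    return notified


rules = [
    EscalationRule(timedelta(minutes=30), ("team-lead",)),
    EscalationRule(timedelta(minutes=5), ("on-call",)),
]
print(contacts_to_escalate(timedelta(minutes=10), rules))
```

Structuring rules this way keeps short outages with the first responder and brings in additional contacts only when the downtime persists.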
Maintenance Windows
During maintenance, failed checks are ignored in uptime calculations. Alerts are logged but not sent. You can review alerts detected during maintenance from the "Alert" screen, which displays ignored alerts as faded.
When a check is paused, by contrast, Uptime.com stops recording alerts for the check until you re-enable it. During this period, Uptime.com does not log response or downtime data, and statistics show no change of status for the time paused.
Maintenance is the preferred alternative to pausing a check, because it ignores alerts only for the specified period and, when you use maintenance windows, requires less human input to return the check to a normal state. Maintenance can be set to:
- No Maintenance Window, when no maintenance is planned (this is the default setting)
- Under Maintenance Now, where all failed checks are ignored until the feature is returned to No Maintenance Window
- Use Maintenance Schedule, where you can Set a Maintenance Window
First, click the Maintenance tab from your Check screen. Click the option to Use Maintenance Schedule and then designate your maintenance window. You will need the following:
- Day or days of the week maintenance will be performed
- Block of time (set with a sliding scale) indicating number of hours needed to perform maintenance
You can use Add to Schedule to schedule multiple maintenance periods. Use this functionality for maintenance that occurs on different days, or to designate maintenance that runs past midnight within a timezone. Please note: all maintenance windows are set based on your account's preferred timezone.
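To show why a period that runs past midnight needs more than one schedule entry, the sketch below splits such a period into two single-day entries and then tests whether a moment falls inside the schedule. The names `split_past_midnight` and `in_maintenance` are hypothetical, weekdays follow Python's convention (0 is Monday), and all times are assumed to be expressed in the account's preferred timezone already.

```python
from datetime import datetime, time
from typing import List, Tuple

# (weekday, start, end) for a window that stays within a single day.
Window = Tuple[int, time, time]


def split_past_midnight(weekday: int, start: time, end: time) -> List[Window]:
    """Return one entry, or two when the period runs into the next day.

    Assumption: an end time at or before the start time means the period
    continues past midnight.
    """
    if start < end:
        return [(weekday, start, end)]
    entries: List[Window] = [(weekday, start, time.max)]
    if end > time.min:
        entries.append(((weekday + 1) % 7, time.min, end))
    return entries


def in_maintenance(moment: datetime, schedule: List[Window]) -> bool:
    """True when `moment` falls inside any scheduled window."""
    return any(
        day == moment.weekday() and start <= moment.time() <= end
        for day, start, end in schedule
    )


# Saturday 23:00 to Sunday 02:00 becomes two entries.
schedule = split_past_midnight(5, time(23, 0), time(2, 0))
print(len(schedule), in_maintenance(datetime(2024, 1, 7, 1, 0), schedule))
```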
It is possible to set maintenance windows in bulk.
Ignoring Alerts
Occasionally, you need to ignore an alert that was a false positive or was caused by some in-house action that maintenance windows did not account for. In these instances, you can manually ignore the alert.
Navigate to Reports > Alerts, then click the Actions menu next to the alert. Click Ignore this Alert to manually ignore it, which fades the colored status indicator on every screen where the alert appears.
Tips and Ideas
Let’s explore these options with a detailed use case.
These advanced options are intended to improve monitoring and reporting, and to give you better control over the conditions under which reports are received.
Beginning with Sensitivity, you might consider multiple checks designed to monitor specific infrastructure in specific regions. If your company is international, checks designed only for the UK, Asia, or the US would provide location-specific data about which regions are experiencing outages.
Escalations are useful for tracking significant errors related to infrastructure that might be prone to small outages. Our article on smart escalations discusses how to structure your contacts to receive notifications as they occur.