Uptime.com monitors your infrastructure for failure, notifying you the moment something goes down. Some use cases require more refined conditions under which an alert is issued. Extended Check Options control how the request is issued, sensitivity and retries, validation, and other technical details.
When a check fails, IT personnel could waste effort researching false positives that didn’t signify a real problem. Uptime.com’s Extended Check Options provide a number of methods to reduce the likelihood of receiving alerts for failed checks that aren’t problematic and to better control the conditions under which a check functions.
Table of Contents
- Extended Check Options At A Glance
- Sensitivity, Retries, and Timeout
- IP Version
- Setting Target SLA% & Response Time SLA
- Checker Version
- Escalations
- Maintenance
- Ignoring Alerts
Extended Check Options At A Glance
Adjust the sensitivity, number of retries, and timeout to avoid false positives.
Create escalations to ensure the right parties are notified when downtime exceeds a set duration.
Set Maintenance schedules to ensure your downtime percentage is not affected by known occurrences.
Setting Check Conditions
Every check’s basic configuration and required fields are set under the General tab.
Advanced options are configured under tabs such as Execution & SLA and Validation & Security. These tabs are dependent on the check type.
Sensitivity, Retries, and Timeout
Sensitivity
Sensitivity is the number of Locations that must fail before a check is considered failed. We recommend that every user monitor from at least three Locations, then set the Sensitivity to a value that matches or accounts for the majority of those Locations. This way, Uptime.com provides the best balance between alert speed and accuracy, while avoiding the false positives that low sensitivity could create.
Note: The number of locations that can be selected for a single check is dependent on your Probe Server Locations plan.
How Sensitivity Works
Going DOWN (failing) → The check status changes to DOWN when the number of failing locations meets or exceeds your sensitivity threshold. Example: 4 locations, sensitivity = 2. In this example, the check goes DOWN when 2 or more locations fail.
Coming back UP (recovering) → Recovery works differently depending on your sensitivity setting:
- If sensitivity is 1 or 2, all locations must be UP for the check to recover.
- If sensitivity is 3 or higher, the check can recover while a location is still DOWN. Example: 3 locations, sensitivity = 3. In this example, 1 location can be DOWN and the check will still recover. Only 2 locations need to recover to change the check back to UP.
Examples of Up/Down Logic
3 locations, sensitivity = 2
UP → DOWN: Check fails when 2 locations are DOWN (meets the sensitivity threshold of 2)
DOWN → UP: Check recovers when ALL 3 locations are UP
4 locations, sensitivity = 4
UP → DOWN: Check fails when 4 locations are DOWN (meets the sensitivity threshold of 4)
DOWN → UP: Check recovers when 3 locations are UP (all but one)
5 locations, sensitivity = 3
UP → DOWN: Check fails when 3 locations are DOWN (meets the sensitivity threshold of 3)
DOWN → UP: Check recovers when 3 locations are UP (all but two)
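The examples above can be reproduced with a short sketch. This is not Uptime.com's implementation. The DOWN rule and the "sensitivity 1 or 2" recovery rule come straight from the text, but the recovery rule for sensitivity 3 or higher is an assumption inferred from the examples (a strict majority of locations UP), and the function names are hypothetical.

```python
def goes_down(failing: int, sensitivity: int) -> bool:
    """UP -> DOWN: failing locations meet or exceed the sensitivity threshold."""
    return failing >= sensitivity


def recovers(up: int, total: int, sensitivity: int) -> bool:
    """DOWN -> UP, following the rules and examples in this section."""
    if sensitivity <= 2:
        # Documented rule: every location must be UP.
        return up == total
    # ASSUMPTION inferred from the examples, not a documented rule:
    # a strict majority of locations must be UP.
    return up >= total // 2 + 1


# The three documented examples: (locations, sensitivity, UP count needed to recover)
for total, sensitivity, needed_up in [(3, 2, 3), (4, 4, 3), (5, 3, 3)]:
    assert goes_down(sensitivity, sensitivity)
    assert recovers(needed_up, total, sensitivity)
    assert not recovers(needed_up - 1, total, sensitivity)
```

Because the DOWN and UP thresholds differ, a check that has gone DOWN does not flip back to UP the moment a single location recovers.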
If a single location fails and remains DOWN for 3 days, a warning will be sent to the contacts configured for the check. This will not register as downtime for the check, or place the check in DOWN status. We recommend reviewing single failed locations that have remained DOWN for 72 hours and resolving them to prevent downtime should another single location fail.
Retries
The number of retry attempts determines how many times a check should be re-run before a location is considered down. The default setting is 2, but Uptime.com allows users to choose from 0-3 attempts. We recommend using 2 retries for fast alerting that avoids false positives.
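As a rough illustration of the retry setting, the sketch below treats a location as down only after the initial run and every retry have failed. It is a simplification under stated assumptions: `run_check` is a hypothetical callable that returns True on success, the loop stops at the first success, and the wait between attempts is left out.

```python
from typing import Callable


def location_is_down(run_check: Callable[[], bool], retries: int = 2) -> bool:
    """Return True if the location should be considered down.

    run_check is a hypothetical callable: True means the check passed.
    The check runs once, then is re-run up to `retries` more times.
    """
    if not 0 <= retries <= 3:
        raise ValueError("retries must be between 0 and 3")
    for _ in range(1 + retries):
        if run_check():
            return False  # any successful attempt keeps the location UP
    return True  # the initial run and every retry failed


# Usage: two failures followed by a success, with the default of 2 retries.
results = iter([False, False, True])
print(location_is_down(lambda: next(results)))  # False
```

With retries set to 0, a single failed run is enough to mark the location down, which is why a lower setting alerts faster but is more exposed to false positives.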
Please note: The retry intervals for API and Transaction checks are two minutes, as opposed to one minute for other checks.
Please note: If you set retries to 0, you need at least one location added to the check. Setting retries to 0 also displays the following messages:
- We recommend using at least 3 locations per check
- We recommend that check sensitivity is set to less than the number of locations
- We recommend using 2 retries for fast alerts times that avoid false positives
Setting retry attempts and sensitivity can be done in bulk.
Timeout
Users can also designate a Timeout, measured in seconds and limited to a maximum value of 60, to further control when and how alerts are issued. A timeout error typically signifies a problem with the connection to a specific function of a website. There may be too many users attempting to access a single resource, or a localized outage affecting connection time. The Timeout option (available only for HTTP(S), API, and Transaction checks) provides the first indication, with technical data, that there is a problem connecting to your website that requires investigation.
Please note: Timeout is not available for Real User Monitoring, Group Checks, Malware/Virus, SSL Certificate, WHOIS/Domain Expiry, Ping ICMP, SSH, TCP/UDP, POP/IMAP/SMTP, and Blacklist checks. Global check timeout is limited to the total run-time of a check.
IP Version
All checks (except for Group checks, API, and Transaction checks) default to IPv4 for connections unless IPv6 is specified. IPv6 is gaining in popularity as more routers and consumer devices utilize the addressing scheme. In specific instances, such as monitoring uptime to an interconnected device or a specific usage of API, it’s important to utilize an IPv6 address. To use any available address, leave this option on Any.
Setting Target SLA% & Response Time SLA
Target SLA%
Target Uptime SLA % indicates the minimum SLA% your check needs to meet for SLA accountability. Our default Target SLA% value is 99.0000 for all check types.
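To put the default target in concrete terms, the arithmetic below converts a Target SLA% into the downtime it allows over a period. The 30-day period and the function name are assumptions for illustration. The reporting window Uptime.com uses for SLA calculations is not stated here.

```python
def allowed_downtime_seconds(target_sla_pct: float, period_seconds: float) -> float:
    """Downtime budget implied by an uptime target over a given period."""
    return period_seconds * (100.0 - target_sla_pct) / 100.0


THIRTY_DAYS = 30 * 24 * 60 * 60  # assumed reporting period, in seconds

budget = allowed_downtime_seconds(99.0, THIRTY_DAYS)
print(budget / 3600)  # 7.2 (hours of downtime allowed at 99% over 30 days)
```

Each additional "nine" in the target cuts the budget by a factor of ten, so 99.9% over the same period allows roughly 43 minutes.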
Response Time SLA
Response Time SLA indicates the average response time (in seconds) your check must stay within to uphold your SLA accountability. Our default Response Time SLA is dependent upon which check type is selected.
Updates to Uptime SLA% and Response Time SLA can be done in bulk.
Checker Version (HTTP(S) Check Only)
There are two HTTP(S) Checker Versions available for use. Newly created HTTP(S) checks will default to the latest version.
Please note: If HTTP is specified in the provided URL, the checker will not verify certificates.
V1.0 Legacy Version
The legacy HTTP(S) checker does not support certificate verification.
V2.0 Curl Version
Version 2.0 is based on curl with support for SSL and TLS certificate verification. V2.0 also supports newer technologies such as HTTP/2, SSL v 3, and chunked content.
Escalations
Escalations allow users to receive a notification that a check has stayed down longer than a designated period of time (from one minute up to several hours). Set your escalation rules by navigating to the Escalations tab when creating or editing a check. Use the Escalation Rules form to designate the amount of time that must pass and who will be notified. You can escalate any check type, with a wide range of options available as to how your escalation will work.
For thorough information on creating and using Escalations, see our dedicated document here.
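Conceptually, an escalation rule pairs a duration with the people to notify once a check has been down that long. The sketch below illustrates that idea only. The `EscalationRule` class, its fields, and the contact names are hypothetical and do not reflect the Escalation Rules form or any Uptime.com API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EscalationRule:
    after_minutes: int  # how long the check must have stayed down
    contact: str        # hypothetical contact to notify


def contacts_to_notify(rules: List[EscalationRule], minutes_down: int) -> List[str]:
    """Return the contacts whose escalation threshold has been reached."""
    return [rule.contact for rule in rules if minutes_down >= rule.after_minutes]


rules = [
    EscalationRule(after_minutes=1, contact="on-call engineer"),
    EscalationRule(after_minutes=30, contact="team lead"),
    EscalationRule(after_minutes=120, contact="operations manager"),
]
print(contacts_to_notify(rules, minutes_down=45))
# ['on-call engineer', 'team lead']
```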
Maintenance
Maintenance windows are used to temporarily suspend alerting and uptime calculations while a check is expected to experience outages. Maintenance Windows can be set on an ad-hoc basis or scheduled for the future based on expected downtime. If a check is expected to experience downtime due to scheduled maintenance to systems or other connectivity issues, we recommend using Maintenance Windows to mitigate unnecessary alerts or impacts to SLA calculations as opposed to pausing the check. Using Maintenance Windows allows for exact timing and reduces the chances of human error.
For thorough information on creating and using Maintenance Windows, see our dedicated document here.
For information on setting Maintenance Windows specifically for Status Pages, see our dedicated document here.
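The effect of a Maintenance Window on uptime figures can be sketched with simple interval arithmetic. This is an illustration under assumptions, not Uptime.com's calculation: times are plain numbers (seconds), maintenance windows do not overlap each other, and time inside a window is excluded from both the downtime and the measured period.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def overlap(a: Interval, b: Interval) -> float:
    """Length of the overlap between two intervals (0 if they do not meet)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def uptime_percent(period: Interval, outages: List[Interval],
                   windows: List[Interval]) -> float:
    """Uptime over `period`, ignoring any time inside maintenance windows."""
    start, end = period
    maintenance = sum(overlap(period, w) for w in windows)
    counted_down = 0.0
    for outage in outages:
        clipped = (max(outage[0], start), min(outage[1], end))
        excused = sum(overlap(clipped, w) for w in windows)
        counted_down += overlap(period, outage) - excused
    measured = (end - start) - maintenance  # assumed to be greater than zero
    return 100.0 * (measured - counted_down) / measured


period = (0, 1000)
outages = [(100, 200), (500, 600)]
print(uptime_percent(period, outages, windows=[]))            # 80.0
print(uptime_percent(period, outages, windows=[(450, 650)]))  # 87.5
```

In the second call the outage from 500 to 600 falls entirely inside the window, so only the first outage counts against uptime.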
Ignoring Alerts
Occasionally, it’s necessary to ignore an alert because it was a false positive or was caused by some in-house action that maintenance windows did not account for. You can manually ignore the alert, which has several important implications.
Ignored alerts:
- Are logged in Alert history, but not in the Alert log of reports
- Do not factor into uptime calculations or reports
- May not appear in the Latest Alerts for custom dashboards, depending on settings
- Leave any resulting downtime visible on check cards in your dashboard
- Do not change the state of a check (A down check will remain down even if alerts are ignored)
Navigate to Alerts, then click the Actions menu next to the alert. Click Ignore this Alert to manually ignore the alert, which fades the colored status indicator on all screens the alert is depicted on.
Note that a check that is experiencing downtime with an ignored alert will display DOWN on the internal Check Report page.