A check configured in Uptime.com can use multiple probe servers at intervals as short as 60 seconds. Occasionally, you may receive a downtime alert for a service, URL, or IP that you can verify is running. This page helps you determine what happened, why the alert was issued, and what you can do about it.
First, we’ll walk through practical steps for verifying the alert on Uptime.com as soon as you receive it. These steps tell you more about why Uptime.com issued the alert and what it could mean for your site.
We’ll then investigate some other potential causes, and review where and how to locate alert data for the most accurate possible timeline of the incident.
Identify and Diagnose
Start by reviewing Real-Time Analysis for the error in question. You can open it from the email alert that Uptime.com sent when the downtime was detected.
Transaction and API check users also have access to Root Cause Analysis, which shows exactly where the check failed and includes screenshots and a browser console log with further technical detail.
Is the Outage Verifiable?
Each check has an adjustable number of retries and sensitivity that dictate when and how the check times out or alerts you of downtime. We recommend a setting of 2 for both Retries and Sensitivity to balance fast alerting with minimal false positives.
We recognize that outages of only one or two minutes can be difficult to trace to a root cause. The error occurred somewhere between our probe servers and your server. The most likely causes include local issues, firewalls, blacklists, timeouts, and load balancer issues.
We suggest using any server response codes, or specific alert data based on the check you’ve configured, to diagnose the issue.
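One way to act on this is to tally the response codes your own server logged around the time of the alert. The Python sketch below is an illustration only. It assumes access-log lines in a Common Log Format style, and the names `status_counts_near`, `sample_lines`, and `alert_time` are hypothetical, as are the sample log entries.

```python
import re
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumed format: Common Log Format style lines such as
# 192.0.2.10 - - [10/Oct/2023:13:56:36 +0000] "GET / HTTP/1.1" 503 0
# Adjust the pattern if your server logs in a different layout.
LOG_PATTERN = re.compile(r'\[(?P<ts>[^\]]+)\] "[^"]*" (?P<status>\d{3})')


def status_counts_near(lines, alert_time, window_minutes=5):
    """Count HTTP status codes logged within window_minutes of alert_time."""
    window = timedelta(minutes=window_minutes)
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if not match:
            continue  # skip lines that do not match the assumed format
        logged_at = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        if abs(logged_at - alert_time) <= window:
            counts[match.group("status")] += 1
    return counts


# Hypothetical log lines and alert time, for illustration only.
sample_lines = [
    '192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512',
    '192.0.2.10 - - [10/Oct/2023:13:56:36 +0000] "GET / HTTP/1.1" 503 0',
    '192.0.2.10 - - [10/Oct/2023:14:30:00 +0000] "GET / HTTP/1.1" 200 512',
]
alert_time = datetime(2023, 10, 10, 13, 57, tzinfo=timezone.utc)
counts = status_counts_near(sample_lines, alert_time)
print(dict(counts))
```

A cluster of 5xx codes near the alert time suggests the server itself returned errors. No logged requests at all in that window may mean the probe traffic never reached the server.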
CRITICAL - Socket timeout after 40 seconds
This error message does not necessarily mean that your service is down. It indicates only that our user agent could not reach the server within 40 seconds, which may point to a non-responsive server or a network issue. To troubleshoot, review your WAF, CDN, or firewall logs for the time of the alert.
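To see whether a timeout can be reproduced from another network, you can time a connection attempt against the same 40-second limit. The Python sketch below is an illustration only and is not the check that Uptime.com runs. It tests only the TCP connection, and the function name `time_tcp_connect`, the default port, and the host in the usage comment are assumptions.

```python
import socket
import time


def time_tcp_connect(host, port=443, timeout=40.0):
    """Try one TCP connection and return (connected, seconds_elapsed, error)."""
    started = time.monotonic()
    try:
        # create_connection raises an OSError subclass on timeout or refusal.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - started, None
    except OSError as exc:
        return False, time.monotonic() - started, exc


# Example usage (replace the placeholder host with your own server):
# connected, elapsed, error = time_tcp_connect("example.com", 443)
# print(connected, round(elapsed, 2), error)
```

If the connection succeeds quickly from your own network while the probe timed out, a firewall, WAF, or routing issue between the probe and your server is a more likely cause than an unresponsive service.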
Whitelist Probe Server Locations
If a load balancer or firewall protects your server, we suggest whitelisting the Uptime.com probe servers so that monitoring traffic is not blocked. You can find the list of servers under Support > Probe Servers.
Navigate to the Probe Servers that correspond to the locations you’re monitoring from and whitelist the IPs you find there.
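When you review firewall logs, it can help to check whether a blocked address belongs to a probe server. The Python sketch below shows one way to do that with the standard `ipaddress` module. The addresses in `PROBE_NETWORKS` are placeholders from reserved documentation ranges, not real Uptime.com probe IPs, and the function name `is_probe_address` is hypothetical.

```python
import ipaddress

# Placeholder addresses from reserved documentation ranges. Replace them with
# the IPs listed for the probe locations you monitor from.
PROBE_NETWORKS = [
    ipaddress.ip_network(entry)
    for entry in ("192.0.2.10/32", "198.51.100.0/24", "2001:db8::/32")
]


def is_probe_address(address, networks=PROBE_NETWORKS):
    """Return True if address falls inside any of the listed networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in networks)


# Hypothetical addresses taken from a firewall's blocked-requests log.
for blocked in ("192.0.2.10", "203.0.113.5"):
    print(blocked, is_probe_address(blocked))
```

Any blocked address that matches the list is a candidate for whitelisting in your firewall or load balancer.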
Review the Uptime.com Alert History
There are several ways to access alert data for a specific check. Alert data is different from the Uptime Report, which measures downtime and associated incidents over a given period.
Your account type determines how long alert data is retained. See Billing > Account Usage or Billing > Subscription for the History Retention limits on your account.
From your Dashboard
Locate the recent alert and click View to see the alert details.
From the Check Screen
Click on Monitoring > Checks, then click Actions > Report. For more information, see our article on Uptime Reports.
From Alert History
Click Reporting > Alerts to see a list of all recent alerts. Locate the alert you wish to review, then click Actions > Details to review the alert details.