Real-Time Analysis provides a chronological timeline of the events and alerts issued for a particular check. It is designed to help users with access learn the state of an outage, build a timeline of the outage, and view technical details for each step in the process.
- Where to Find Real-Time Analysis
- Root Cause Analysis
- Root Cause Analysis Examples
- Where to Find Traceroute
- Real-Time Analysis Details
- Technical Outage Data
- Probe Server Data
- Load Real-Time Check Status
- Location Status
- Recent Alerts Per Location
Where to Find Real-Time Analysis
You can open this screen from several places in the Uptime.com interface:
From the Dashboard, under Latest Alerts: click View Details next to the alert you wish to view, scroll to the bottom of the outage popup, and click the Real-Time Analysis button.
From a check card or Report: click a check card on your Dashboard, or open any check's Report, to find a link to Real-Time Analysis.
From Monitoring > Checks: click Actions > Analysis.
Real-Time Analysis is also available as a direct link from the alert emails issued to your contacts by Uptime.com. Click the link for “Real Time Analysis” to see the Analysis for the check in question.
Clients using the Uptime.com API can also request this information for delivery via the `/checks/{pk}/analysis` endpoint.
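For readers scripting against the API, the following is a minimal sketch of building that request with Python's standard library. The base URL `https://uptime.com/api/v1` and the `Token` authorization scheme are assumptions and should be verified against the API documentation for your account. `build_analysis_request` is a hypothetical helper name, and the check ID and token are placeholders.

```python
import urllib.request

# Assumption: base URL and "Token" auth scheme; verify against the API docs.
API_BASE = "https://uptime.com/api/v1"


def build_analysis_request(check_pk: int, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a check's analysis data."""
    # Endpoint path as given in this article: /checks/{pk}/analysis
    url = f"{API_BASE}/checks/{check_pk}/analysis"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Token {api_token}",
            "Accept": "application/json",
        },
        method="GET",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and decode the response body as JSON. The shape of the response is not described in this article, so inspect it before relying on specific fields.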
Please note: Real-Time Analysis is not available for RUM and Group Checks.
Root Cause Analysis
API and Transaction checks include a root cause analysis tool that identifies the specific step where a failure occurred, to help you diagnose the problem. You can also access root cause analysis from the alert email issued to your check’s contacts.
Root cause analysis provides detailed check results, including response time metrics for each step the check was able to complete prior to failure, as well as screenshots and any notes that you have saved corresponding to the check.
Root Cause Analysis Examples
Transaction and API Check Alert Details also include a screenshot of what Uptime.com encountered, the specific step where it occurred, and technical data on the expected output. In this screenshot, text is expected for an element in Step 6, but that element does not exist:
Scroll down to review the Check Results and the Browser Console log.
Check Results displays the steps that the API or Transaction Check took, as well as technical details on the failure.
In this API check example, we see the probe server is reporting a timeout error, which is causing failure at Step 1:
Viewing Headers
Both API and Transaction check types allow users to view header information when a request is sent. Run a test check and click the “i” icon:
This example illustrates returned data for a failed API check:
Where to Find Traceroute
Uptime.com can run a traceroute from a specific probe server location (including Private Location probe servers). This feature is enabled by default for all Premium accounts, and is available as an add-on to other plans. Traceroute is useful when you need to detect connection anomalies between Uptime.com and your service.
Scroll to the bottom of your Real-Time Analysis page for the check in question, accessed via the instructions above, and you will see a dropdown menu that contains the probe server locations assigned to your check. Select the location to run a traceroute from and click Submit.
Traceroute runs are limited to 10 per minute, 50 per hour, and 100 per day per account.
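If you trigger traceroutes from your own tooling, it can help to track these limits locally before making a request. The sketch below is one possible client-side approach using sliding windows. `TracerouteBudget` is a hypothetical helper, not part of Uptime.com, and the article does not say whether the service counts fixed or sliding windows, so treat this as a conservative approximation.

```python
import time
from collections import deque

# Limits taken from the text: (window in seconds, max runs in that window).
TRACEROUTE_LIMITS = ((60, 10), (3600, 50), (86400, 100))


class TracerouteBudget:
    """Client-side sliding-window tracker for traceroute runs (hypothetical)."""

    def __init__(self, limits=TRACEROUTE_LIMITS, clock=time.monotonic):
        self.limits = limits
        self.clock = clock
        self.runs = deque()  # timestamps of recorded runs, oldest first

    def try_acquire(self) -> bool:
        """Record a run and return True only if every window still has room."""
        now = self.clock()
        longest = max(window for window, _ in self.limits)
        # Forget runs that are older than the longest window.
        while self.runs and now - self.runs[0] >= longest:
            self.runs.popleft()
        for window, max_runs in self.limits:
            recent = sum(1 for t in self.runs if now - t < window)
            if recent >= max_runs:
                return False
        self.runs.append(now)
        return True
```

Call `try_acquire()` before each traceroute and skip or delay the run when it returns `False`. The `clock` parameter exists only so the behavior can be tested with a fake clock.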
Please note: Traceroute functionality is not available for check types that have a specified frequency, including: WHOIS/Domain Expiry, SSL Certificate checks, Domain Blacklist, and Malware/Virus checks, as well as Custom checks (heartbeat and webhook).
Real-Time Analysis Details
The Real-Time Analysis view contains summary details of each outage related to the last 100 alerts, and links to specific alerts issued when an outage occurred.
Technical Outage Data
Users will find detailed technical data on where the outage or failure occurred. The status of the check signals when and how the check was triggered, and helps explain the conditions that created or triggered the alert. This includes response time, response codes, and links to the specific alert. Where specified, the data will designate whether the response matched the user-defined value, and whether the check failed as a result.
Probe Server Data
Additionally, users will find probe server data that displays when an outage has occurred (whether an alert was sent for downtime or not). This view may show a probe server outage that did not trigger an alert; for example, a single probe server failing on a check that has a sensitivity of two. It is also possible to view the dates a probe server was assigned.
Load Real-Time Check Status
Uptime.com includes a function to check the status of the probe servers used in the check you are analyzing. We encourage users to utilize this real-time check status button to investigate the status of a probe, especially when confirming the status of a check.
Location Status
The first column tells you which Location the probe servers are monitoring from, and provides the IP address for each Assigned Probe Server for your check. The State tells you whether the probe server is reporting as OK or CRITICAL. The check will issue a DOWN alert when the number of probe servers reporting CRITICAL matches or exceeds the number you designated as the check’s sensitivity.
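The sensitivity rule can be expressed as a simple count. The sketch below models the rule as described here and is not Uptime.com's implementation. The function name and the location labels are hypothetical.

```python
from typing import Mapping


def meets_sensitivity(probe_states: Mapping[str, str], sensitivity: int) -> bool:
    """Return True when enough probe servers report CRITICAL to trigger DOWN."""
    critical = sum(1 for state in probe_states.values() if state.upper() == "CRITICAL")
    return critical >= sensitivity


# One failing probe with a sensitivity of two does not trigger a DOWN alert.
states = {"location-a": "CRITICAL", "location-b": "OK", "location-c": "OK"}
one_failure_triggers = meets_sensitivity(states, sensitivity=2)

# A second CRITICAL probe meets the sensitivity threshold.
states["location-b"] = "CRITICAL"
two_failures_trigger = meets_sensitivity(states, sensitivity=2)
```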
Each probe server lists a Last Alert Date, shown in your account's preferred timezone, indicating when the last alert occurred for that probe server. Last Alert Details provides a brief summary of the technical data collected during that alert. Scroll to Recent Alerts Per Location for a full report on outages the check has experienced.
This section shows the current status of each location configured for a check. To confirm a check's status, use the Real Time Check Status button at the top of the Real-Time Analysis screen. This button returns the results of the most recent run of the check from each location.
Recent Alerts Per Location
The recent alert details table includes the date, along with some additional information:
- Location - The probe server region where a state change occurred.
- Probe Server - The specific probe server reporting a change in status. Hover over a Probe Server to see the IP address for the probe in question.
- Location's State Changed To - The state that the probe server entered as a result of the alert: OK, WARNING, CRITICAL, or UNKNOWN.
- Details - The specific alert details each probe server is reporting.
- # Locations Down - The number of probe server locations down at the reported time. If your check uses 5 probe servers, and each server reports an outage, you will see the locations DOWN begin at 0 and progress to 5 as each location reports an outage. This number will decrease to 0 as probe servers return to UP status.
- Overall Check State - The state of the check at the timestamp reported (either UP or DOWN).
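To illustrate how the # Locations Down column evolves, the sketch below replays a list of state changes and records the running count after each one. It assumes, for illustration only, that a location counts as down while its state is CRITICAL; the article does not say how WARNING or UNKNOWN are counted. The function name and location labels are hypothetical.

```python
def locations_down_timeline(events):
    """Return the number of locations down after each (location, new_state) event."""
    down = set()
    counts = []
    for location, new_state in events:
        # Assumption: only CRITICAL counts as down; any other state clears it.
        if new_state.upper() == "CRITICAL":
            down.add(location)
        else:
            down.discard(location)
        counts.append(len(down))
    return counts


# Five locations fail one after another, then recover one after another.
locations = [f"location-{n}" for n in range(1, 6)]
events = [(loc, "CRITICAL") for loc in locations] + [(loc, "OK") for loc in locations]
timeline = locations_down_timeline(events)
```

In this replay the count climbs to 5 as each location reports an outage and falls back to 0 as each one recovers, matching the progression described in the list above.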
When Check State changes to DOWN, recent alerts will contain a link to Alert Details for probe server outages that met your check’s sensitivity settings. Locate the date and time that the alert was issued, then click Alert Details from the server that triggered the alert.
The alert details provide access to the alert data that was generated at the time of failure.
Discrepancies Between Check State and State
When a check is returning to UP status, the Check State will not return to UP until the State of all probe servers is OK. If your Check State is DOWN, be sure to review the State of each probe server as you conduct your troubleshooting.
We also suggest a review of your check’s sensitivity, which is the number of locations that must report State as CRITICAL before Check State is changed to DOWN.
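The difference between going DOWN and returning to UP can be modeled as a small state update. The sketch below is a simplified model of the behavior described in this section, not Uptime.com's implementation. The function name and location labels are hypothetical.

```python
from typing import Mapping


def next_check_state(current: str, probe_states: Mapping[str, str], sensitivity: int) -> str:
    """Return the Check State ("UP" or "DOWN") implied by the probe States."""
    states = [state.upper() for state in probe_states.values()]
    # Going DOWN: enough probes report CRITICAL to meet the sensitivity.
    if states.count("CRITICAL") >= sensitivity:
        return "DOWN"
    # Returning to UP: a DOWN check stays DOWN until every probe is OK.
    if current == "DOWN" and any(state != "OK" for state in states):
        return "DOWN"
    return "UP"


probes = {"location-a": "CRITICAL", "location-b": "OK", "location-c": "OK"}
# Below sensitivity, but the check was already DOWN, so it stays DOWN.
still_down = next_check_state("DOWN", probes, sensitivity=2)
# The same probe States do not take an UP check DOWN.
stays_up = next_check_state("UP", probes, sensitivity=2)
```

This is why a check can show DOWN while fewer probe servers than the sensitivity value report CRITICAL: the remaining probe has not yet returned to OK.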